Microorganisms and methods for producing sialylated and N-acetylglucosamine-containing oligosaccharides

ABSTRACT

The invention provides compositions and methods for engineering bacteria to produce sialylated and N-acetylglucosamine-containing oligosaccharides, and the use thereof in the prevention or treatment of infection.

RELATED APPLICATIONS

This application is a national stage application, filed under 35 U.S.C. §371, of International Application No. PCT/US2014/029804, filed on Mar. 14, 2014, which claims benefit of, and priority to, U.S. Ser. No. 61/782,999, filed on Mar. 14, 2013; the contents of which are hereby incorporated by reference in their entirety.

INCORPORATION-BY-REFERENCE OF SEQUENCE LISTING

The contents of the text file named “37847_512001WO_ST25.txt”, which was created on Jun. 9, 2014, and is 144 KB in size, are hereby incorporated by reference in their entirety.

FIELD OF THE INVENTION

The invention provides compositions and methods for producing purified oligosaccharides, in particular certain N-acetylglucosamine-containing and/or sialylated oligosaccharides that are typically found in human milk.

BACKGROUND OF THE INVENTION

Human milk contains a diverse and abundant set of neutral and acidic oligosaccharides (human milk oligosaccharides, hMOS). Many of these molecules are not utilized directly by infants for nutrition, but they nevertheless serve critical roles in the establishment of a healthy gut microbiome, in the prevention of disease, and in immune function. Prior to the invention described herein, the ability to produce hMOS inexpensively at large scale was problematic. For example, hMOS production through chemical synthesis was limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost. As such, there is a pressing need for new strategies to inexpensively manufacture large quantities of hMOS for a variety of commercial applications.

SUMMARY OF THE INVENTION

The invention described herein features efficient and economical methods for producing N-acetylglucosamine-containing and/or sialylated oligosaccharides.

The invention provides a method for producing an N-acetylglucosamine-containing oligosaccharide in a bacterium comprising the following steps: providing a bacterium that comprises an exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase and a functional lactose permease; and culturing the bacterium in the presence of lactose. The N-acetylglucosamine-containing oligosaccharide is then retrieved from the bacterium or from a culture supernatant of the bacterium.

The invention further provides a method for producing a sialylated oligosaccharide in a bacterium comprising the following steps: providing a bacterium that comprises an exogenous sialyl-transferase gene, a deficient sialic acid catabolic pathway, a sialic acid synthetic capability, and a functional lactose permease gene; and culturing the bacterium in the presence of lactose. The sialylated oligosaccharide is then retrieved from the bacterium or from a culture supernatant of the bacterium. Specifically, a sialic acid synthetic capability comprises expressing exogenous CMP-Neu5Ac synthetase, an exogenous sialic acid synthase, and an exogenous UDP-GlcNAc-2-epimerase, or a functional variant or fragment thereof.

In both methods for producing N-acetylglucosamine-containing and/or sialylated oligosaccharides, it is preferable that the bacterium further comprises the capability for increased UDP-GlcNAc production. By “increased production capability” is meant that the host bacterium produces greater than 10%, 20%, 50%, 100%, 2-fold, 5-fold, 10-fold, or more of a product than the native, endogenous bacterium. Preferably, the bacterium over-expresses a positive endogenous regulator of UDP-GlcNAc synthesis. For example, the bacterium over-expresses the nagC gene of Escherichia coli. Alternatively, the bacterium over-expresses the Escherichia coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene, or alternatively, over-expresses the Escherichia coli glmY gene (a positive translational regulator of glmS), or, alternatively, over-expresses the Escherichia coli glmZ gene (another positive translational regulator of glmS: glmY and glmZ are described in Reichenbach et al., Nucleic Acids Res 36, 2570-80 (2008)). Alternatively, the bacterium over-expresses any combination of such approaches. For example, the bacterium over-expresses nagC and glmS. Alternatively, the bacterium over-expresses nagC and glmY. Alternatively, the bacterium over-expresses nagC and glmZ. The methods also further encompass over-expressing any functional variant or fragment of nagC, glmS, glmY and glmZ and any combination thereof. By “overexpression” is meant that the gene transcript or encoded gene product is 10%, 20%, 50%, 2-fold, 5-fold, 10-fold, or more than the level expressed or produced by the corresponding native, naturally-occurring, or endogenous gene.

The invention described herein details the manipulation of genes and pathways within bacteria such as the enterobacterium Escherichia coli K12 (E. coli) leading to high level synthesis of hMOS. Other strains of E. coli suitable for use in the present invention include E. coli MG1655, E. coli W3110, E. coli DH5αE, E. coli B, E. coli C, and E. coli W. A variety of bacterial species are suitable for use in the oligosaccharide biosynthesis methods, for example Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris. Bacteria of the genus Bacillus are suitable for use, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans. Similarly, bacteria of the genera Lactobacillus and Lactococcus are modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis. Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein.
Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bacteroides (e.g., Bacteroides caccae, Bacteroides cellulosilyticus, Bacteroides dorei, Bacteroides eggerthii, Bacteroides finegoldii, Bacteroides fragilis, Bacteroides nordii, Bacteroides ovatus, Bacteroides salyersiae, Bacteroides thetaiotaomicron, Bacteroides uniformis, Bacteroides vulgatus and Bacteroides xylanisolvens), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Parabacteroides (e.g., Parabacteroides distasonis, Parabacteroides goldsteinii, Parabacteroides johnsonii and Parabacteroides merdae), Prevotella (e.g., Prevotella copri), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa). Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and an N-acetylglucosamine-containing or sialylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. The N-acetylglucosamine-containing or sialylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products.

The bacterium comprises a deleted or inactivated (i.e., non-functional) endogenous β-galactosidase gene. For example, the β-galactosidase gene comprises an E. coli lacZ gene (e.g., GenBank Accession Number V00296.1 (GI:41901), incorporated herein by reference). The endogenous lacZ gene of the E. coli is deleted or functionally inactivated, but in such a way that expression of the downstream lactose permease (lacY) gene remains intact, i.e., a functional lactose permease gene is also present in the bacterium. By deleted is meant that a portion or the whole coding sequence is absent, such that no gene product is produced. An “inactivated” gene does not produce a gene product that functions as the native, naturally-occurring, or endogenous gene. For example, the functional activity of an inactivated β-galactosidase gene product is reduced to 10%, 20%, 50%, or 100%, 1-fold, 2-fold, 5-fold, or 10-fold less than the functional activity of the native, naturally-occurring, endogenous gene product.

The lactose permease gene is an endogenous lactose permease gene or an exogenous lactose permease gene. For example, the lactose permease gene comprises an E. coli lacY gene (e.g., GenBank Accession Number V00295.1 (GI:41897), incorporated herein by reference). Many bacteria possess the inherent ability to transport lactose from the growth medium into the cell, by utilizing a transport protein that is either a homolog of the E. coli lactose permease (e.g., as found in Bacillus licheniformis), or a transporter that is a member of the ubiquitous PTS sugar transport family (e.g., as found in Lactobacillus casei and Lactobacillus rhamnosus). For bacteria lacking an inherent ability to transport extracellular lactose into the cell cytoplasm, this ability is conferred by an exogenous lactose transporter gene (e.g., E. coli lacY) provided on recombinant DNA constructs, and supplied either on a plasmid expression vector or as exogenous genes integrated into the host chromosome.

For the production of N-acetylglucosamine-containing oligosaccharides, the bacterium comprises an exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase gene or a functional variant or fragment thereof. This exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase gene is obtained from any one of a number of sources, e.g., the LgtA gene described from N. meningitidis (SEQ ID NO:16, GenBank protein Accession AAF42258.1, incorporated herein by reference) or N. gonorrhoeae (GenBank protein Accession ACF31229.1). Optionally, an additional exogenous glycosyltransferase gene is co-expressed in the bacterium comprising an exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase. For example, a β-1,4-galactosyltransferase gene is co-expressed with the UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase gene. This exogenous β-1,4-galactosyltransferase gene is obtained from any one of a number of sources, e.g., that described from N. meningitidis, the LgtB gene (GenBank protein Accession AAF42257.1), or from H. pylori, the Lex2B gene (SEQ ID NO:17, GenBank protein Accession NP_207619.1, incorporated herein by reference). Optionally, the additional exogenous glycosyltransferase gene co-expressed in the bacterium comprising an exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase gene is a β-1,3-galactosyltransferase gene, e.g., that described from E. coli O55:H7, the WbgO gene (SEQ ID NO:18, GenBank protein Accession YP_003500090.1, incorporated herein by reference), or from H. pylori, the jhp0563 gene (GenBank protein Accession AEZ55696.1). Functional variants and fragments of any of the enzymes described above are also encompassed by the present invention.

In one embodiment, the N-acetylglucosamine-containing oligosaccharides produced by the methods described herein include Lacto-N-triose 2 (LNT2), Lacto-N-tetraose (LNT), Lacto-N-neotetraose (LNnT), Lacto-N-fucopentaose I (LNF I), Lacto-N-fucopentaose II (LNF II), Lacto-N-fucopentaose III (LNF III), Lacto-N-fucopentaose V (LNF V), Lacto-N-difucohexaose I (LDFH I), Lacto-N-difucohexaose II (LDFH II), and Lacto-N-neodifucohexaose II (LFNnDFH II).

For the production of sialyl-oligosaccharides, the bacterium comprises an exogenous sialyl-transferase gene. For example, the exogenous sialyl-transferase gene encodes an α(2,3) sialyl-transferase, an α(2,6) sialyl-transferase, or an α(2,8) sialyl-transferase. The exogenous sialyl-transferase gene is obtained from any one of a number of sources, e.g., those described from N. meningitidis, N. gonorrhoeae, and from a number of organisms of the genus Photobacterium. Examples of α(2,8) sialyl-transferases, useful for the production of polysialic acid for example, are found in Campylobacter jejuni (CstII: ADN52706) and Neisseria meningitidis (siaD: AAA20478).

The bacteria used herein to produce hMOS are genetically engineered to comprise an increased intracellular lactose pool (as compared to wild type) and to comprise UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase and/or sialyl-transferase activity. Optionally, they also comprise β-1,4-galactosyltransferase or β-1,3-galactosyltransferase activity, and/or α-1,2-, α-1,3- and/or α-1,4-fucosyltransferase activity. In some cases, the bacterium further comprises a functional, wild-type E. coli lacZ⁺ gene inserted into an endogenous gene, for example, the lon gene in E. coli or the thyA gene in E. coli. In this manner, the bacterium further comprises a mutation in a lon gene or a mutation in the thyA gene. In these cases, the endogenous lacZ gene of the E. coli is deleted or functionally inactivated, but in such a way that expression of the downstream lactose permease (lacY) gene remains intact. The organism so manipulated maintains the ability to transport lactose from the growth medium, and to develop an intracellular lactose pool for use as an acceptor sugar in oligosaccharide synthesis, while also maintaining a low level of intracellular β-galactosidase activity useful for a variety of additional purposes. For example, the invention also includes: a) methods for phenotypic marking of a gene locus in a β-galactosidase negative host cell by utilizing a β-galactosidase (e.g., lacZ) gene insert engineered to produce a low but readily detectable level of β-galactosidase activity, b) methods for readily detecting lytic bacteriophage contamination in fermentation runs through release and detection of cytoplasmic β-galactosidase in the cell culture medium, and c) methods for depleting a bacterial culture of residual lactose at the end of production runs.
Methods a), b) and c) are each achieved by utilizing a functional β-galactosidase (e.g., lacZ) gene insert carefully engineered to direct the expression of a low, but detectable level of β-galactosidase activity in an otherwise β-galactosidase negative host cell. The bacterium optionally further comprises a mutation in a lacA gene. Preferably, the bacterium accumulates an increased intracellular lactose pool, and produces a low level of β-galactosidase. An increased intracellular pool is one wherein the concentration of lactose in the host bacterium is at least 10%, 20%, 50%, 2-fold, 5-fold, or 10-fold higher than that of the native, naturally-occurring bacterium.

In one aspect, the human milk oligosaccharide produced by engineered bacteria comprising an exogenous nucleic acid molecule encoding a UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase and an exogenous nucleic acid encoding β-1,4-galactosyltransferase is lacto-N-neotetraose (LNnT). In another aspect, the human milk oligosaccharide produced by engineered bacteria comprising an exogenous nucleic acid molecule encoding a UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase and an exogenous nucleic acid encoding β-1,3-galactosyltransferase is lacto-N-tetraose (LNT).
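The two enzyme pairings stated in this paragraph can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `predicted_product` and the string labels are hypothetical conveniences, and only the two pairings named in the paragraph are encoded.

```python
from typing import Optional

# Pairings stated in the text: the β3-N-acetylglucosaminyltransferase,
# co-expressed with a second galactosyltransferase, yields a named hMOS.
PRODUCT_BY_GALT = {
    "beta-1,4-galactosyltransferase": "LNnT",  # lacto-N-neotetraose
    "beta-1,3-galactosyltransferase": "LNT",   # lacto-N-tetraose
}


def predicted_product(has_glcnac_transferase: bool,
                      galt: Optional[str]) -> Optional[str]:
    """Return the hMOS named in the text for an enzyme pairing, else None."""
    if not has_glcnac_transferase:
        return None
    return PRODUCT_BY_GALT.get(galt)


if __name__ == "__main__":
    print(predicted_product(True, "beta-1,4-galactosyltransferase"))
    print(predicted_product(True, "beta-1,3-galactosyltransferase"))
```

Pairings the paragraph does not state return `None` rather than a guessed product.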

Described herein are compositions comprising a bacterial cell that produces the human milk oligosaccharide LNnT (lacto-N-neotetraose), wherein the bacterial cell comprises an exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase and an exogenous nucleic acid encoding a β-1,4-galactosyltransferase. Preferably, the bacterial cell is E. coli. The exogenous UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase gene is obtained from any one of a number of sources, e.g., the LgtA gene described from N. meningitidis. The exogenous β-1,4-galactosyltransferase gene is obtained from any one of a number of sources, e.g., that described from N. meningitidis, the LgtB gene, or from H. pylori, the jhp0765 gene.

Additionally, the bacterium preferably comprises increased production of UDP-GlcNAc. An exemplary means to achieve this is by over-expression of a positive endogenous regulator of UDP-GlcNAc synthesis, for example, over-expression of the nagC gene of Escherichia coli. In one aspect, this nagC over-expression is achieved by providing additional copies of the nagC gene on a plasmid vector or by integrating additional nagC gene copies into the host cell chromosome. Alternatively, over-expression is achieved by modulating the strength of the ribosome binding sequence directing nagC translation or by modulating the strength of the promoter directing nagC transcription. As further alternatives, the intracellular UDP-GlcNAc pool may be enhanced by other means, for example by over-expressing the Escherichia coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene, or alternatively by over-expressing the Escherichia coli glmY gene (a positive translational regulator of glmS), or alternatively by over-expressing the Escherichia coli glmZ gene (another positive translational regulator of glmS), or alternatively by simultaneously using a combination of approaches. In one preferred embodiment, for example, the nagC (SEQ ID NO:19, GenBank protein Accession BAA35319.1, incorporated herein by reference) and glmS (SEQ ID NO:20, GenBank protein Accession NP_418185.1, incorporated herein by reference) genes which encode the sequences provided herein are over-expressed simultaneously in the same host cell in order to increase the intracellular pool of UDP-GlcNAc. Other components of UDP-GlcNAc metabolism include: (GlcNAc-1-P) N-acetylglucosamine-1-phosphate; (GlcN-1-P) glucosamine-1-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; and (Fruc-6-P) fructose-6-phosphate.
Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and lacto-N-neotetraose is retrieved, either from the bacterium itself (i.e., by lysis) or from a culture supernatant of the bacterium.

Also within the invention is an isolated E. coli bacterium as described above and characterized as comprising a deleted or inactivated endogenous β-galactosidase gene, an inactivated or deleted lacA gene, and a functional lactose permease (lacY) gene.

Also described herein are compositions comprising a bacterial cell that produces the human milk oligosaccharide 6′-SL (6′-sialyllactose), wherein the bacterial cell comprises an exogenous sialyl-transferase gene encoding α(2,6) sialyl-transferase. Preferably, the bacterial cell is E. coli. The exogenous sialyl-transferase gene utilized for 6′-SL production is obtained from any one of a number of sources, e.g., those described from a number of organisms of the genus Photobacterium. In yet another aspect, the human milk oligosaccharide produced by engineered bacteria comprising an exogenous nucleic acid molecule encoding an α(2,3) sialyl-transferase is 3′-SL (3′-sialyllactose). The exogenous sialyl-transferase gene utilized for 3′-SL production is obtained from any one of a number of sources, e.g., those described from N. meningitidis and N. gonorrhoeae.

Additionally, the bacterium contains a deficient sialic acid catabolic pathway. By “sialic acid catabolic pathway” is meant a sequence of reactions, usually controlled and catalyzed by enzymes, which results in the degradation of sialic acid. An exemplary sialic acid catabolic pathway in Escherichia coli is described herein. In the sialic acid catabolic pathway described herein, sialic acid (Neu5Ac; N-acetylneuraminic acid) is degraded by the enzymes NanA (N-acetylneuraminic acid lyase) and NanK (N-acetylmannosamine kinase) and NanE (N-acetylmannosamine-6-phosphate epimerase), all encoded in the nanATEK-yhcH operon, and repressed by NanR (ecocyc.org/ECOLI). A deficient sialic acid catabolic pathway is engineered in Escherichia coli by way of a mutation in endogenous nanA (N-acetylneuraminate lyase) (e.g., GenBank Accession Number D00067.1 (GI:216588), incorporated herein by reference) and/or nanK (N-acetylmannosamine kinase) genes (e.g., GenBank Accession Number (amino acid) BAE77265.1 (GI:85676015), incorporated herein by reference), and/or nanE (N-acetylmannosamine-6-phosphate epimerase, GI:947745, incorporated herein by reference). Optionally, the nanT (N-acetylneuraminate transporter) gene is also inactivated or mutated. Other intermediates of sialic acid metabolism include: (ManNAc-6-P) N-acetylmannosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; (GlcN-6-P) glucosamine-6-phosphate; and (Fruc-6-P) fructose-6-phosphate. In some preferred embodiments, nanA is mutated. In other preferred embodiments, nanA and nanK are mutated, while nanE remains functional. In another preferred embodiment, nanA and nanE are mutated, while nanK has not been mutated, inactivated or deleted. A mutation is one or more changes in the nucleic acid sequence coding the gene product of nanA, nanK, nanE, and/or nanT. For example, the mutation may be 1, 2, 5, 10, 25, 50 or 100 changes in the nucleic acid sequence. For example, the nanA, nanK, nanE, and/or nanT is mutated by a null mutation.
Null mutations as described herein encompass amino acid substitutions, additions, deletions, or insertions that either cause a loss of function of the enzyme (i.e., reduced or no activity) or loss of the enzyme (i.e., no gene product). By deleted is meant that the coding region is removed in whole or in part such that no gene product is produced. By inactivated is meant that the coding sequence has been altered such that the resulting gene product is functionally inactive or encodes a gene product with less than 100%, 80%, 50%, or 20% of the activity of the native, naturally-occurring, endogenous gene product. A “not mutated” gene or protein does not differ from a native, naturally-occurring, or endogenous coding sequence by 1, 2, 5, 10, 20, 50, 100, 200 or 500 or more codons, or from the corresponding encoded amino acid sequence.

Moreover, the bacterium (e.g., E. coli) also comprises a sialic acid synthetic capability. For example, the bacterium comprises a sialic acid synthetic capability through provision of an exogenous UDP-GlcNAc 2-epimerase (e.g., neuC of Campylobacter jejuni (SEQ ID NO:13, GenBank AAK91727.1; GI:15193223, incorporated herein by reference) or equivalent (e.g., E. coli S88 neuC, GenBank YP_002392936.1; GI:218560023)), a Neu5Ac synthase (e.g., neuB of C. jejuni (SEQ ID NO:14, GenBank AAK91726.1; GI:15193222, incorporated herein by reference) or equivalent (e.g., Flavobacterium limnosediminis sialic acid synthase, GenBank GI:559220424)), and/or a CMP-Neu5Ac synthetase (e.g., neuA of C. jejuni (SEQ ID NO:15, GenBank AAK91728.1; GI:15193224, incorporated herein by reference) or equivalent (e.g., Vibrio brasiliensis CMP-sialic acid synthase, GenBank GI:493937153)). Functional variants and fragments are also disclosed herein.

Additionally, the bacterium comprising a sialic acid synthetic capability preferably comprises increased production of UDP-GlcNAc. An exemplary means to achieve this is by over-expression of a positive endogenous regulator of UDP-GlcNAc synthesis, for example, simultaneous over-expression of the nagC and glmS genes of Escherichia coli. This nagC and glmS over-expression is achieved by providing additional copies of the nagC and glmS genes on a plasmid vector, or by integrating additional nagC and glmS gene copies into the host cell chromosome. Alternatively, over-expression is achieved by modulating the strength of the ribosome binding sequence directing nagC (described by Sleight et al., Nucleic Acids Res. May 2010; 38(8): 2624-2636) and/or glmS translation, or by modulating the strength of the promoter/s directing nagC and glmS transcription (Sleight et al., Nucleic Acids Res. May 2010; 38(8): 2624-2636).

Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and, in the instance where cells comprise an α(2,6) sialyltransferase (e.g., Photobacterium spp JT-ISH-224 (SEQ ID NO:21, GenBank protein Accession BAF92026.1, incorporated herein by reference)), 6′-sialyllactose is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. In the instance where cells comprise an α(2,3) sialyltransferase (e.g., Neisseria meningitidis lst (GenBank protein Accession NP273962.1)), 3′-sialyllactose is recovered either from the bacterium itself (e.g., by lysis of the bacterium) or from a culture supernatant of the bacterium.

Also within the invention is an isolated E. coli bacterium as described above and characterized as comprising a deleted or inactivated endogenous β-galactosidase gene, an exogenous sialyl-transferase gene, a deficient sialic acid catabolic pathway, a sialic acid synthetic capability, a deleted lacA gene, and a functional lactose permease (lacY) gene.
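The six characteristics listed for this isolated strain lend themselves to a simple checklist. The Python sketch below is a hypothetical bookkeeping aid, not part of the disclosed methods; the class name `StrainFeatures`, its field names, and the helper `missing_features` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class StrainFeatures:
    """One boolean per characteristic listed in the text (names hypothetical)."""
    lacZ_deleted_or_inactivated: bool
    exogenous_sialyltransferase: bool
    sialic_acid_catabolism_deficient: bool
    sialic_acid_synthetic_capability: bool
    lacA_deleted: bool
    functional_lacY: bool


def missing_features(strain: StrainFeatures) -> List[str]:
    """Return the names of listed characteristics the strain record lacks."""
    return [name for name, present in asdict(strain).items() if not present]


if __name__ == "__main__":
    candidate = StrainFeatures(True, True, True, True, False, True)
    print(missing_features(candidate))
```

An empty list means the record carries every characteristic recited in the paragraph.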

A purified N-acetylglucosamine-containing or sialylated oligosaccharide produced by the methods described above is also within the invention. A purified oligosaccharide, e.g., 6′-SL, is one that is at least 90%, 95%, 98%, 99%, or 100% (w/w) of the desired oligosaccharide by weight. Purity is assessed by any known method, e.g., thin layer chromatography or other electrophoretic or chromatographic techniques known in the art. The invention includes a method of purifying an N-acetylglucosamine-containing or sialylated oligosaccharide produced by the genetically engineered bacteria described above, which method comprises separating the desired N-acetylglucosamine-containing or sialylated oligosaccharide (e.g., 6′-SL) from contaminants in a bacterial cell extract or lysate, or bacterial cell culture supernatant. Contaminants include bacterial DNA, protein and cell wall components, and yellow/brown sugar caramels sometimes formed in spontaneous chemical reactions in the culture medium.
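The weight-based purity definition above reduces to simple arithmetic. The Python sketch below assumes that the mass of the desired oligosaccharide and the total dry mass of the preparation have already been measured by some method; the function names and the idea of reporting the highest recited threshold are illustrative assumptions.

```python
from typing import Optional

# Purity levels recited in the text, in percent (w/w).
PURITY_THRESHOLDS = (90.0, 95.0, 98.0, 99.0, 100.0)


def purity_percent(target_mass_g: float, total_mass_g: float) -> float:
    """Percent (w/w) of the desired oligosaccharide in a preparation."""
    if total_mass_g <= 0:
        raise ValueError("total mass must be positive")
    if not 0 <= target_mass_g <= total_mass_g:
        raise ValueError("target mass must lie between 0 and total mass")
    return 100.0 * target_mass_g / total_mass_g


def highest_threshold_met(purity: float) -> Optional[float]:
    """Return the highest recited threshold the purity reaches, or None."""
    met = [t for t in PURITY_THRESHOLDS if purity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    p = purity_percent(9.6, 10.0)
    print(p, highest_threshold_met(p))
```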

The oligosaccharides are purified and used in a number of products for consumption by humans as well as animals, such as companion animals (dogs, cats) as well as livestock (bovine, equine, ovine, caprine, or porcine animals, as well as poultry). For example, a pharmaceutical composition comprising purified 6′-sialyllactose (6′-SL) and an excipient is suitable for oral administration. Large quantities of 6′-SL are produced in bacterial hosts, e.g., an E. coli bacterium comprising a heterologous sialyltransferase, e.g., a heterologous α(2,6) sialyltransferase. An E. coli bacterium comprising an enhanced cytoplasmic pool of each of the following: lactose and CMP-Neu5Ac, is useful in such production systems. In the case of lactose, endogenous E. coli metabolic pathways and genes are manipulated in ways that result in the generation of increased cytoplasmic concentrations of lactose, as compared to levels found in wild type E. coli. For example, the bacteria contain at least 10%, 20%, 50%, 2×, 5×, 10× or more of the levels in a corresponding wild type bacterium that lacks the genetic modifications described above. In the case of CMP-Neu5Ac, endogenous Neu5Ac catabolism genes are inactivated and exogenous CMP-Neu5Ac biosynthesis genes introduced into E. coli, resulting in the generation of a cytoplasmic pool of CMP-Neu5Ac not found in the wild type bacterium.

A method of producing a pharmaceutical composition comprising a purified hMOS is carried out by culturing the bacterium described above, purifying the hMOS produced by the bacterium, and combining the hMOS with an excipient or carrier to yield a dietary supplement for oral administration. These compositions are useful in methods of preventing or treating enteric and/or respiratory diseases in infants and adults. Accordingly, the compositions are administered to a subject suffering from or at risk of developing such a disease using known methods of clinical therapy.

The invention also provides for increasing, in E. coli, the intracellular concentration of the nucleotide sugar uridine diphosphate N-acetylglucosamine (UDP-GlcNAc). This is achieved by over-expressing the bi-functional endogenous positive regulator of UDP-GlcNAc synthesis and repressor of glucosamine and N-acetylglucosamine catabolism, nagC, simultaneously with the gene encoding L-glutamine:D-fructose-6-phosphate aminotransferase, glmS.

The invention also provides for increasing the intracellular concentration of lactose in E. coli, for cells grown in the presence of lactose, by using manipulations of endogenous E. coli genes involved in lactose import, export, and catabolism. In particular, described herein are methods of increasing intracellular lactose levels in E. coli genetically engineered to produce a human milk oligosaccharide by incorporating a lacA mutation into the genetically modified E. coli. The lacA mutation prevents the formation of intracellular acetyl-lactose, which not only removes this molecule as a contaminant from subsequent purifications, but also eliminates E. coli's ability to export excess lactose from its cytoplasm, thus greatly facilitating purposeful manipulations of the E. coli intracellular lactose pool.

Also described herein are bacterial host cells with the ability to accumulate an intracellular lactose pool while simultaneously possessing low, functional levels of cytoplasmic β-galactosidase activity, for example as provided by the introduction of a functional recombinant E. coli lacZ gene, or by a β-galactosidase gene from any of a number of other organisms (e.g., the lac4 gene of Kluyveromyces lactis (e.g., GenBank Accession Number M84410.1 (GI:173304), incorporated herein by reference)). Low, functional levels of cytoplasmic β-galactosidase include β-galactosidase activity levels of between 0.05 and 200 units, e.g., between 0.05 and 5 units, between 0.05 and 4 units, between 0.05 and 3 units, or between 0.05 and 2 units (for standard definition see: Miller J H, Laboratory CSH. Experiments in molecular genetics. Cold Spring Harbor Laboratory Cold Spring Harbor, N.Y.; 1972; incorporated herein by reference). This low level of cytoplasmic β-galactosidase activity, while not high enough to significantly diminish the intracellular lactose pool, is nevertheless very useful for tasks such as phenotypic marking of desirable genetic loci during construction of host cell backgrounds, for detection of cell lysis due to undesired bacteriophage contaminations in fermentation processes, for the facile removal of undesired residual lactose at the end of fermentations, or for in-process fermentation QC purposes (i.e., as a non-standard phenotype the provision of a weak lacZ phenotype aids in culture purity assessments).
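The activity units cited above follow the Miller definition. As a worked illustration, the Python sketch below uses the commonly quoted form of the Miller calculation, units = 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600), with t in minutes and v in millilitres. The function names are hypothetical, and treating the 0.05 and 200 unit bounds as inclusive is an assumption, since the text says only "between".

```python
def miller_units(od420: float, od550: float, od600: float,
                 minutes: float, volume_ml: float) -> float:
    """β-galactosidase activity in Miller units (commonly quoted form)."""
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    # 1.75 * OD550 corrects the OD420 reading for light scattering by cell debris.
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def is_low_functional_level(units: float,
                            low: float = 0.05, high: float = 200.0) -> bool:
    """True if activity falls in the 0.05 to 200 unit range (bounds assumed inclusive)."""
    return low <= units <= high


if __name__ == "__main__":
    u = miller_units(od420=0.1, od550=0.0, od600=0.5, minutes=20, volume_ml=0.5)
    print(u, is_low_functional_level(u))
```

The narrower ranges recited in the text (for example 0.05 to 5 units) can be checked by passing a different `high` value.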

Methods of purifying an N-acetylglucosamine-containing or sialylated oligosaccharide produced by the methods described herein are carried out by binding the oligosaccharide from a bacterial cell lysate or bacterial cell culture supernatant of the bacterium to a carbon column, and subsequently eluting it from the column. Purified N-acetylglucosamine-containing or sialylated oligosaccharides are produced by the methods described herein.

Optionally, the invention features a vector, e.g., a vector containing a nucleic acid. The vector can further include one or more regulatory elements, e.g., a heterologous promoter. The regulatory elements can be operably linked to a protein gene, fusion protein gene, or a series of genes linked in an operon in order to express the fusion protein. To maintain the plasmid vector stably within the cell, a selectable marker is included within its sequence, such as an antibiotic resistance gene or a gene that complements a nutritional auxotrophy of the host bacterium. For example, in E. coli, a thymidine deficiency caused by a chromosomal defect in the thymidylate synthase gene (thyA) can be complemented by a plasmid-borne wild type copy of the thyA gene (M. Belfort, G. F. Maley, F. Maley, Proceedings of the National Academy of Sciences 80, 1858 (1983)). Alternatively, an adenine deficiency caused by a chromosomal deficiency in the adenylosuccinate synthetase (purA) gene (S. A. Wolfe, J. M. Smith, J Biol Chem 263, 19147-53 (1988)) can be complemented by a plasmid-borne wild type copy of purA. Two plasmid vectors may be utilized simultaneously within the same bacterial cell by employing separate selectable markers, for example one plasmid utilizing thyA selection and one utilizing purA selection, and by utilizing two compatible plasmid replicons; for example, in E. coli two such compatible replicons comprise the ColE1 (pUC) replicon and the p15A (pACYC) replicon (R. E. Bird, J Bacteriol 145, 1305-9 (1981)). In yet another aspect, the invention comprises an isolated recombinant cell, e.g., a bacterial cell containing the aforementioned nucleic acid molecule/s or vector/s. The nucleic acid sequences can be optionally integrated into the genome.
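The two-plasmid rule described above (separate selectable markers plus compatible replicons) can be expressed as a short check. The Python sketch below is a hypothetical planning aid: the `Plasmid` class, the plasmid names, and the `can_coexist` helper are assumptions, and the compatibility table contains only the ColE1/p15A pair named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str      # hypothetical label for illustration
    replicon: str  # e.g., "ColE1" or "p15A"
    marker: str    # selectable marker, e.g., "thyA" or "purA"


# Only the compatible replicon pair named in the text is listed here.
COMPATIBLE_REPLICONS = {frozenset({"ColE1", "p15A"})}


def can_coexist(a: Plasmid, b: Plasmid) -> bool:
    """True if two plasmids use distinct markers and a listed compatible replicon pair."""
    distinct_markers = a.marker != b.marker
    compatible = frozenset({a.replicon, b.replicon}) in COMPATIBLE_REPLICONS
    return distinct_markers and compatible


if __name__ == "__main__":
    p1 = Plasmid("plasmid_A", "ColE1", "thyA")
    p2 = Plasmid("plasmid_B", "p15A", "purA")
    print(can_coexist(p1, p2))
```

Two plasmids sharing a replicon, or sharing a marker, fail the check even if the other condition holds.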

The invention provides a method of treating, preventing, or reducing the risk of infection in a subject comprising administering to said subject a composition comprising a human milk oligosaccharide, purified from a culture of a recombinant strain of the current invention, wherein the hMOS binds to a pathogen and wherein the subject is infected with or at risk of infection with the pathogen. In one aspect, the infection is caused by a Norwalk-like virus or Campylobacter jejuni. The subject is preferably a mammal in need of such treatment. The mammal is, e.g., any mammal, e.g., a human, a primate, a mouse, a rat, a dog, a cat, a cow, a horse, or a pig. In a preferred embodiment, the mammal is a human. For example, the compositions are formulated into animal feed (e.g., pellets, kibble, mash) or animal food supplements for companion animals, e.g., dogs or cats, as well as livestock or animals grown for food consumption, e.g., cattle, sheep, pigs, chickens, and goats. Preferably, the purified hMOS is formulated into a powder (e.g., infant formula powder or adult nutritional supplement powder, each of which is mixed with a liquid such as water or juice prior to consumption) or in the form of tablets, capsules or pastes or is incorporated as a component in dairy products such as milk, cream, cheese, yogurt or kefir, or as a component in any beverage, or combined in a preparation containing live microbial cultures intended to serve as probiotics, or in prebiotic preparations intended to enhance the growth of beneficial microorganisms either in vitro or in vivo. For example, the purified sugar (e.g., LNnT or 6′-SL) can be mixed with a Bifidobacterium or Lactobacillus in a probiotic nutritional composition. (Bifidobacteria are beneficial components of a normal human gut flora and are also known to utilize hMOS for growth.)

All genes described herein also include a description of the corresponding encoded gene products. As such, the uses of exogenous genes as described herein encompass nucleic acids that encode the gene product sequences disclosed herein. The person skilled in the art could readily generate nucleic acid sequences that encode the protein sequences described herein and introduce such sequences into expression vectors to carry out the present invention.

The term “substantially pure” in reference to a given polypeptide, polynucleotide or oligosaccharide means that the polypeptide, polynucleotide or oligosaccharide is substantially free from other biological macromolecules. The substantially pure polypeptide, polynucleotide or oligosaccharide is at least 75% (e.g., at least 80, 85, 95, or 99%) pure by dry weight. Purity can be measured by any appropriate calibrated standard method, for example, by column chromatography, polyacrylamide gel electrophoresis, thin layer chromatography (TLC) or HPLC analysis.

Polynucleotides, polypeptides, and oligosaccharides of the invention are purified and/or isolated. Purified defines a degree of sterility that is safe for administration to a human subject, e.g., lacking infectious or toxic agents. Specifically, as used herein, an “isolated” or “purified” nucleic acid molecule, polynucleotide, polypeptide, protein or oligosaccharide, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. For example, purified hMOS compositions are at least 60% by weight (dry weight) the compound of interest. Preferably, the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight the compound of interest. Purity is measured by any appropriate calibrated standard method, for example, by column chromatography, polyacrylamide gel electrophoresis, thin layer chromatography (TLC) or HPLC analysis. For example, a “purified protein” refers to a protein that has been separated from other proteins, lipids, and nucleic acids with which it is naturally associated. Preferably, the protein constitutes at least 10, 20, 50, 70, 80, 90, 95, or 99-100% by dry weight of the purified preparation.

By “isolated nucleic acid” is meant a nucleic acid that is free of the genes that flank it in the naturally-occurring genome of the organism from which the nucleic acid is derived. The term covers, for example: (a) a DNA which is part of a naturally occurring genomic DNA molecule, but is not flanked by both of the nucleic acid sequences that flank that part of the molecule in the genome of the organism in which it naturally occurs; (b) a nucleic acid incorporated into a vector or into the genomic DNA of a prokaryote or eukaryote in a manner, such that the resulting molecule is not identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), or a restriction fragment; and (d) a recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a fusion protein. Isolated nucleic acid molecules according to the present invention further include molecules produced synthetically, as well as any nucleic acids that have been altered chemically and/or that have modified backbones. For example, the isolated nucleic acid is a purified cDNA or RNA polynucleotide.

A “heterologous promoter”, when operably linked to a nucleic acid sequence, refers to a promoter which is not naturally associated with the nucleic acid sequence.

The term “over-express” as used herein refers to a level of gene transcript or encoded gene product that is 10%, 20%, 50%, 2-fold, 5-fold, 10-fold, or more than the level expressed or produced by a native, naturally-occurring, or endogenous gene in a bacterium in which it naturally occurs. For example, the host bacteria described herein are engineered to over-express an exogenous gene transcript or encoded gene product of UDP-GlcNAc:Galα/β-R β3-N-acetylglucosaminyltransferase, nagC, glmS, glmY, glmZ, a sialyl-transferase, a β-galactosyltransferase, an α-fucosyltransferase, CMP-Neu5Ac synthetase, a sialic acid synthase, or a UDP-GlcNAc 2-epimerase, i.e., a gene or gene product with a sequence corresponding to that of a bacterium other than the host bacterium.

The terms “treating” and “treatment” as used herein refer to the administration of an agent or formulation to a clinically symptomatic individual afflicted with an adverse condition, disorder, or disease, so as to effect a reduction in severity and/or frequency of symptoms, eliminate the symptoms and/or their underlying cause, and/or facilitate improvement or remediation of damage. The terms “preventing” and “prevention” refer to the administration of an agent or composition to a clinically asymptomatic individual who is susceptible to a particular adverse condition, disorder, or disease, and thus relate to the prevention of the occurrence of symptoms and/or their underlying cause.

By the terms “effective amount” and “therapeutically effective amount” of a formulation or formulation component is meant a nontoxic but sufficient amount of the formulation or component to provide the desired effect.

The transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. By contrast, the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. The transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.

Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof, and from the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All published foreign patents and patent applications cited herein are incorporated herein by reference. Genbank and NCBI submissions indicated by accession number cited herein are incorporated herein by reference. All other published references, documents, manuscripts and scientific literature cited herein are incorporated herein by reference. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic demonstrating metabolic pathways and the changes introduced into them to engineer 2′-fucosyllactose (2′-FL) synthesis in Escherichia coli (E. coli). Specifically, the lactose synthesis pathway and the GDP-fucose synthesis pathway are illustrated. In the GDP-fucose synthesis pathway: manA=phosphomannose isomerase (PMI), manB=phosphomannomutase (PMM), manC=mannose-1-phosphate guanylyltransferase (GMP), gmd=GDP-mannose-4,6-dehydratase, fcl=GDP-fucose synthase (GFS), and ΔwcaJ=mutated UDP-glucose lipid carrier transferase.

FIG. 2 is a schematic demonstrating metabolic pathways involved in the synthesis of UDP-GlcNAc (uridine diphosphate N-acetylglucosamine) and catabolism of glucosamine and N-acetylglucosamine in E. coli. In the schematic: (GlcNAc-1-P) N-acetylglucosamine-1-phosphate; (GlcN-1-P) glucosamine-1-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; and (Fruc-6-P) fructose-6-phosphate; glmS (L-glutamine:D-fructose-6-phosphate aminotransferase), glmM (phosphoglucosamine mutase), glmU (fused N-acetyl glucosamine-1-phosphate uridyltransferase and glucosamine-1-phosphate acetyl transferase), nagC (bifunctional transcriptional activator/repressor protein), nagA (N-acetylglucosamine-6-phosphate deacetylase), nagB (glucosamine-6-phosphate deaminase), nagE (N-acetylglucosamine transporter) and manXYZ (glucosamine transporter).

FIG. 3 is a schematic demonstrating metabolic pathways and one example (utilizing nanT, nanA and nanK deletions) of the changes introduced into them to engineer 6′-sialyllactose (6′-SL) synthesis in E. coli. Abbreviations include: (Neu5Ac) N-acetylneuraminic acid, sialic acid; (ΔnanT) mutated N-acetylneuraminic acid transporter; (ΔnanA) mutated N-acetylneuraminic acid lyase; (ManNAc) N-acetylmannosamine; (ΔnanK) mutated N-acetylmannosamine kinase; (nanE) wild-type N-acetylmannosamine-6-phosphate epimerase; (ManNAc-6-P) N-acetylmannosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (Fruc-6-P) fructose-6-phosphate; (neuA) CMP-N-acetylneuraminic acid synthetase; (CMP-Neu5Ac) CMP-N-acetylneuraminic acid; (neuB) N-acetylneuraminic acid synthase; (neuC) UDP-GlcNAc-2-epimerase; and (UDP-GlcNAc) uridine diphosphate N-acetylglucosamine.

FIG. 4 is a schematic that illustrates the new configuration of genes engineered at the Escherichia coli thyA locus in strains used to produce N-acetylglucosamine-containing oligosaccharides.

FIG. 5 is a plasmid map of pG292, which expresses the N. meningitidis β(1,3)-N-acetylglucosaminyltransferase gene lgtA.

FIG. 6 is a plasmid map of pG221, which expresses, as an operon, the N. meningitidis β(1,3)-N-acetylglucosaminyltransferase gene lgtA and the E. coli O55:H7 wbgO β(1,3)-galactosyltransferase gene.

FIG. 7 is a plasmid map of pG222, which expresses, as an operon, the N. meningitidis β(1,3)-N-acetylglucosaminyltransferase gene lgtA and the H. pylori 4GalT (jhp0765) β(1,4)-galactosyltransferase gene.

FIG. 8 illustrates schematically the enzymatic reactions necessary to produce from lactose, via the intermediate trisaccharide lacto-N-triose 2 (LNT2), the two human milk oligosaccharides: Lacto-N-tetraose (LNT) and Lacto-N-neotetraose (LNnT). A thin layer chromatogram (on left) is presented of culture medium samples taken from small scale E. coli cultures and demonstrating synthesis of LNT2, LNT and LNnT. A second thin layer chromatogram (on right) is presented of culture medium samples taken from a 15 L E. coli bioreactor culture, demonstrating synthesis of LNnT.

FIG. 9 is a plasmid map of pG317, a low-copy vector which expresses as an operon, under the control of the E. coli lac promoter, the Campylobacter jejuni ATCC43438 neuB, neuC and neuA genes, encoding N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase, respectively.

FIG. 10 is a plasmid map of pG315, a multi-copy vector which expresses a gene encoding an α(2,6) sialyltransferase from Photobacterium spp JT-ISH-224, under the control of the E. coli lac promoter.

FIG. 11 is a photograph of a thin layer chromatogram showing 6′-SL in culture medium produced by E. coli strain E547 (ΔnanRATEK), containing plasmids expressing a bacterial α(2,3)sialyltransferase and neuA, neuB and neuC. FIG. 11 also shows a TLC analysis of culture supernatants from two fermentations producing 6′-sialyllactose (6′-SL). Samples to the left of the figure are taken from a fermentation of an E. coli strain containing pG315 (carrying a strong RBS in front of the α(2,6)sialyltransferase gene in the vector). Samples on the right of the figure are taken from a fermentation of an E. coli strain containing a close variant of pG315 that carries a weaker RBS in front of the α(2,6)sialyltransferase gene.

FIG. 12 is a plasmid map of pG345, a multi-copy vector which expresses a gene encoding an α(2,6) sialyltransferase from Photobacterium spp JT-ISH-224, under the control of a weaker ribosomal binding site (SEQ ID NO:8) and the E. coli lac promoter.

FIG. 13 is a schematic demonstrating metabolic pathways and a second example (utilizing nanT, nanA and nanE deletions) of the changes introduced into them to engineer 6′-sialyllactose (6′-SL) synthesis in E. coli. Abbreviations include: (Neu5Ac) N-acetylneuraminic acid, sialic acid; (ΔnanT) mutated N-acetylneuraminic acid transporter; (ΔnanA) mutated N-acetylneuraminic acid lyase; (ManNAc) N-acetylmannosamine; (nanK) wild-type N-acetylmannosamine kinase; (ΔnanE) mutated N-acetylmannosamine-6-phosphate epimerase; (ManNAc-6-P) N-acetylmannosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (Fruc-6-P) fructose-6-phosphate; (neuA) CMP-N-acetylneuraminic acid synthetase; (CMP-Neu5Ac) CMP-N-acetylneuraminic acid; (neuB) N-acetylneuraminic acid synthase; (neuC) UDP-GlcNAc-2-epimerase; and (UDP-GlcNAc) uridine diphosphate N-acetylglucosamine.

FIG. 14 illustrates the TLC analysis of cell pellets and/or supernatants from three pilot scale fermentation experiments using three E. coli strains carrying various combinations of nan mutations.

FIG. 15 is a schematic illustrating the location of the gene deletion made within the E. coli nan operon to generate the [nanR+, nanA, nanT, nanE, nanK+] mutant locus of strains E1017 and E1018.

FIG. 16 is a cell density growth curve plot of four cultures of E680 transformed with pG292, induced or un-induced by tryptophan addition, and in the presence or absence of lactose in the growth medium. Abundant cell lysis is seen in the lactose-containing cultures.

FIG. 17 is a plasmid map of pG356, which expresses, as an operon, the E. coli glmS and nagC genes. pG356 carries a p15A replication origin and both ampC and purA selectable markers.

FIG. 18 is a fermentation parameter trace and TLC culture supernatant analysis (for LNnT production) of a 1.5 L bioreactor culture of E796 transformed with pG222.

FIG. 19 is a fermentation parameter trace and TLC culture supernatant analysis (for LNnT production) of a 1.5 L bioreactor culture of E866 transformed with both pG222 and pG356.

DETAILED DESCRIPTION OF THE INVENTION

Described herein are genetic constructs and methods for production of N-acetylglucosamine-containing human milk oligosaccharides (hMOS) and sialyloligosaccharides. In order to make both N-acetylglucosamine-containing and sialyl-containing hMOS, one needs to tap into the cellular UDP-GlcNAc pool. Doing so can be challenging, since UDP-GlcNAc is an essential metabolite for bacteria (used to make the cell wall). The constructs, compositions, and methods of the invention overcome difficulties of the past by enhancing the UDP-GlcNAc pool, a strategy that represents an advantage in the production of both classes of hMOS. Other distinctions over earlier approaches represent improvements and/or confer advantages over those earlier strategies.

hMOS

Human milk glycans, which comprise both oligosaccharides (hMOS) and their glycoconjugates, play significant roles in the protection and development of human infants, and in particular the infant gastrointestinal (GI) tract. Milk oligosaccharides found in various mammals differ greatly, and their composition in humans is unique (Hamosh M., 2001 Pediatr Clin North Am, 48:69-86; Newburg D. S., 2001 Adv Exp Med Biol, 501:3-10). Moreover, glycan levels in human milk change throughout lactation and also vary widely among individuals (Morrow A. L. et al., 2004 J Pediatr, 145:297-303; Chaturvedi P et al., 2001 Glycobiology, 11:365-372). Previously, a full exploration of the roles of hMOS was limited by the inability to adequately characterize and measure these compounds. In recent years sensitive and reproducible quantitative methods for the analysis of both neutral and acidic hMOS have been developed (Erney, R., Hilty, M., Pickering, L., Ruiz-Palacios, G., and Prieto, P. (2001) Adv Exp Med Biol 501, 285-297; Bao, Y., and Newburg, D. S. (2008) Electrophoresis 29, 2508-2515). Approximately 200 distinct oligosaccharides have been identified in human milk, and combinations of a small number of simple epitopes are responsible for this diversity (Newburg D. S., 1999 Curr Med Chem, 6:117-127; Ninonuevo M. et al., 2006 J Agric Food Chem, 54:7471-7480). hMOS are composed of 5 monosaccharides: D-glucose (Glc), D-galactose (Gal), N-acetylglucosamine (GlcNAc), L-fucose (Fuc), and sialic acid (N-acetyl neuraminic acid, Neu5Ac, NANA). hMOS are usually divided into two groups according to their chemical structures: neutral compounds containing Glc, Gal, GlcNAc, and Fuc, linked to a lactose (Galβ1-4Glc) core, and acidic compounds including the same sugars, and often the same core structures, plus NANA (Charlwood J. et al., 1999 Anal Biochem, 273:261-277; Martín-Sosa et al., 2003 J Dairy Sci, 86:52-59; Parkkinen J. and Finne J., 1987 Methods Enzymol, 138:289-300; Shen Z. et al., 2001 J Chromatogr A, 921:315-321). Approximately 70-80% of oligosaccharides in human milk are fucosylated. A smaller proportion of the oligosaccharides in human milk are sialylated, or are both fucosylated and sialylated.

Interestingly, hMOS as a class survive transit through the intestine of infants very efficiently, a function of their being poorly transported across the gut wall and of their resistance to digestion by human gut enzymes (Chaturvedi, P., Warren, C. D., Buescher, C. R., Pickering, L. K. & Newburg, D. S. Adv Exp Med Biol 501, 315-323 (2001)). One consequence of this survival in the gut is that hMOS are able to function as prebiotics, i.e. they are available to serve as an abundant carbon source for the growth of resident gut commensal microorganisms (Ward, R. E., Niñonuevo, M., Mills, D. A., Lebrilla, C. B., and German, J. B. (2007) Mol Nutr Food Res 51, 1398-1405). Recently, there has been burgeoning interest in the role of diet and dietary prebiotic agents in determining the composition of the gut microflora, and in understanding the linkage between the gut microflora and human health (Roberfroid, M., Gibson, G. R., Hoyles, L., McCartney, A. L., Rastall, R., Rowland, I., Wolvers, D., Watzl, B., Szajewska, H., Stahl, B., Guarner, F., Respondek, F., Whelan, K., Coxam, V., Davicco, M. J., Léotoing, L., Wittrant, Y., Delzenne, N. M., Cani, P. D., Neyrinck, A. M., and Meheust, A. (2010) Br J Nutr 104 Suppl 2, S1-63).

A number of human milk glycans possess structural homology to cell receptors for enteropathogens, and serve roles in pathogen defense by acting as molecular receptor “decoys”. For example, pathogenic strains of Campylobacter bind specifically to glycans in human milk containing the H-2 epitope, i.e., 2′-fucosyl-N-acetyllactosamine or 2′-fucosyllactose (2′-FL); Campylobacter binding and infectivity are inhibited by 2′-FL and other glycans containing this H-2 epitope (Ruiz-Palacios, G. M., Cervantes, L. E., Ramos, P., Chavez-Munguia, B., and Newburg, D. S. (2003) J Biol Chem 278, 14112-14120). Similarly, some diarrheagenic E. coli pathogens are strongly inhibited in vivo by hMOS containing 2′-linked fucose moieties. Several major strains of human caliciviruses, especially the noroviruses, also bind to 2′-linked fucosylated glycans, and this binding is inhibited by human milk 2′-linked fucosylated glycans. Consumption of human milk that has high levels of these 2′-linked fucosyloligosaccharides has been associated with lower risk of norovirus, Campylobacter, ST of E. coli-associated diarrhea, and moderate-to-severe diarrhea of all causes in a Mexican cohort of breastfeeding children (Newburg D. S. et al., 2004 Glycobiology, 14:253-263; Newburg D. S. et al., 1998 Lancet, 351:1160-1164). Several pathogens are also known to utilize sialylated glycans as their host receptors, such as influenza (Couceiro, J. N., Paulson, J. C. & Baum, L. G. Virus Res 29, 155-165 (1993)), parainfluenza (Amonsen, M., Smith, D. F., Cummings, R. D. & Air, G. M. J Virol 81, 8341-8345 (2007)), and rotaviruses (Kuhlenschmidt, T. B., Hanafin, W. P., Gelberg, H. B. & Kuhlenschmidt, M. S. Adv Exp Med Biol 473, 309-317 (1999)). The sialyl-Lewis X epitope is used by Helicobacter pylori (Mahdavi, J., Sondén, B., Hurtig, M., Olfat, F. O., et al. Science 297, 573-578 (2002)), Pseudomonas aeruginosa (Scharfman, A., Delmotte, P., Beau, J., Lamblin, G., et al. Glycoconj J 17, 735-740 (2000)), and some strains of noroviruses (Rydell, G. E., Nilsson, J., Rodriguez-Diaz, J., Ruvoën-Clouet, N., et al. Glycobiology 19, 309-320 (2009)).

The nucleotide sugar uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) is a key metabolic intermediate in bacteria, where it is involved in the synthesis and maintenance of the cell envelope. In all known bacterial classes, UDP-GlcNAc is used to make peptidoglycan (murein), a polymer comprising the bacterial cell wall whose structural integrity is absolutely essential for growth and survival. In addition, gram-negative bacteria use UDP-GlcNAc for the synthesis of lipid A, an important component of the outer cell membrane. Thus, for bacteria, the ability to maintain an adequate intracellular pool of UDP-GlcNAc is critical.

Biosynthesis of certain human milk oligosaccharides (hMOS) has been achieved in engineered strains of the bacterium Escherichia coli K12. As described herein, simple fucosylated hMOS, e.g. 2′-fucosyllactose (2′-FL), 3-fucosyllactose (3-FL), and lactodifucotetraose (LDFT), are produced efficiently by live E. coli through artificially enhancing existing intracellular pools of GDP-fucose (the nucleotide sugar donor) and lactose (the accepting sugar), and by then using these enhanced pools as substrates for heterologous recombinant fucosyltransferases (FIG. 1). Since neither the lactose nor GDP-fucose pools are essential for E. coli survival, biosynthesis of simple fucosylated hMOS is achieved at good yields without negative consequences on the host bacterium's growth or viability. However, to synthesize more complex hMOS in E. coli, use of the critical bacterial UDP-GlcNAc pool is required, with consequent potential impacts on cell viability.

The UDP-GlcNAc pool in E. coli is produced through the combined action of three glm genes, glmS (L-glutamine:D-fructose-6-phosphate aminotransferase), glmM (phosphoglucosamine mutase), and the bifunctional glmU (fused N-acetyl glucosamine-1-phosphate uridyltransferase and glucosamine-1-phosphate acetyl transferase) (FIG. 2). These three genes direct a steady flow of carbon to UDP-GlcNAc, a flow that originates with fructose-6-phosphate (an abundant molecule of central energy metabolism). Expression of the glm genes is under positive control by the transcriptional activator protein, NagC.

When E. coli encounters glucosamine or N-acetyl-glucosamine in its environment, these molecules are each transported into the cell via specific membrane transport proteins and are used either to supplement the flow of carbon to the UDP-GlcNAc pool, or alternatively they are consumed to generate energy, under the action of nag operon gene products (i.e. nagA [N-acetylglucosamine-6-phosphate deacetylase] and nagB [glucosamine-6-phosphate deaminase]). In contrast to the glm genes, expression of nagA and nagB is under negative transcriptional control, but by the same regulatory protein as the glm genes, i.e. NagC. NagC is thus bi-functional, able to activate UDP-GlcNAc synthesis, while at the same time repressing the degradation of glucosamine-6-phosphate and N-acetylglucosamine-6-phosphate.

The binding of NagC to specific regulatory DNA sequences (operators), whether such binding results in gene activation or repression, is sensitive to fluctuations in the cytoplasmic level of the small-molecule inducer and metabolite, GlcNAc-6-phosphate. Intracellular concentrations of GlcNAc-6-phosphate increase when N-acetylglucosamine is available as a carbon source in the environment, and thus under these conditions the expression of the glm genes (essential to maintain the vital UDP-GlcNAc pool) would decrease, unless a compensatory mechanism is brought into play. E. coli maintains a baseline level of UDP-GlcNAc synthesis through continuous expression of nagC directed by two constitutive promoters, located within the upstream nagA gene. This constitutive level of nagC expression is supplemented approximately threefold under conditions where the degradative nag operon is induced, and by this means E. coli ensures an adequate level of glm gene expression under all conditions, even when N-acetylglucosamine is being utilized as a carbon source.

Many hMOS incorporate GlcNAc into their structures directly, and many also incorporate sialic acid, a sugar whose synthesis involves consumption of UDP-GlcNAc (FIG. 3, FIG. 13). Thus, synthesis of many types of hMOS in engineered E. coli carries the significant risk of reduced product yield and compromised cell viability resulting from depletion of the bacterium's UDP-GlcNAc pool. One way to address this problem during engineered synthesis of GlcNAc- or sialic acid-containing hMOS is to boost the UDP-GlcNAc pool through simultaneous over-expression of nagC, or preferably by simultaneous over-expression of both nagC and glmS.

While studies suggest that human milk glycans could be used as prebiotics and as antimicrobial anti-adhesion agents, the difficulty and expense of producing adequate quantities of these agents of a quality suitable for human consumption has limited their full-scale testing and perceived utility. What has been needed is a suitable method for producing the appropriate glycans in sufficient quantities at reasonable cost. Prior to the invention described herein, there were attempts to use several distinct synthetic approaches for glycan synthesis. Novel chemical approaches can synthesize oligosaccharides (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003)), but reactants for these methods are expensive and potentially toxic (Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). Enzymes expressed from engineered organisms (Albermann, C., Piepersberg, W. & Wehmeier, U. F. Carbohydr Res 334, 97-103 (2001); Bettler, E., Samain, E., Chazalet, V., Bosso, C., et al. Glycoconj J 16, 205-212 (1999); Johnson, K. F. Glycoconj J 16, 141-146 (1999); Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Wymer, N. & Toone, E. J. Curr Opin Chem Biol 4, 110-119 (2000)) provide a precise and efficient synthesis (Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Crout, D. H. & Vic, G. Curr Opin Chem Biol 2, 98-111 (1998)), but the high cost of the reactants, especially the sugar nucleotides, limits their utility for low-cost, large-scale production. Microbes have been genetically engineered to express the glycosyltransferases needed to synthesize oligosaccharides from the bacteria's innate pool of nucleotide sugars (Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 330, 439-443 (2001); Endo, T., Koizumi, S., Tabata, K. & Ozaki, A. Appl Microbiol Biotechnol 53, 257-261 (2000); Endo, T. & Koizumi, S. Curr Opin Struct Biol 10, 536-541 (2000); Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 316, 179-183 (1999); Koizumi, S., Endo, T., Tabata, K. & Ozaki, A. Nat Biotechnol 16, 847-850 (1998)). However, low overall product yields and high process complexity have limited the commercial utility of these approaches.

Prior to the invention described herein, which enables the inexpensive production of large quantities of neutral and acidic hMOS, it had not been possible to fully investigate the ability of this class of molecule to inhibit pathogen binding, or indeed to explore their full range of potential additional functions.

Prior to the invention described herein, chemical syntheses of hMOS were possible, but were limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003); Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). Also, prior to the invention described herein, in vitro enzymatic syntheses were possible, but were limited by a requirement for expensive nucleotide-sugar precursors. The invention overcomes the shortcomings of these previous attempts by providing new strategies to inexpensively manufacture large quantities of human milk oligosaccharides for use as dietary supplements. The invention described herein makes use of E. coli (or other bacteria) engineered to produce sialylated oligosaccharides at commercially viable levels; for example, the methods described herein enable the production of 3′-SL at >50 g/L in bioreactors.

Variants and Functional Fragments

The present invention features introducing exogenous genes into a bacterium to manipulate the pathways to increase UDP-GlcNAc pools, to produce sialylated oligosaccharides and to produce N-acetylglucosamine-containing oligosaccharides. In any of the methods described herein, the genes or gene products may be variants or functional fragments thereof.

A variant of any of the genes or gene products disclosed herein may have 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the nucleic acid or amino acid sequences described herein. The term “% identity,” in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. For example, % identity is relative to the entire length of the coding regions of the sequences being compared, or the length of a particular fragment or functional domain thereof.

Variants as disclosed herein also include homologs, orthologs, or paralogs of the genes or gene products described herein that retain the same biological function as the genes or gene products specified herein. These variants can be used interchangeably with the genes recited in these methods. Such variants may demonstrate a percentage of homology or identity, for example, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity in conserved domains important for biological function, preferably in a functional domain, e.g. catalytic domain.

For sequence comparison, one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters. Percent identity is determined using BLAST and PSI-BLAST (Altschul et al., 1990, J Mol Biol 215:3, 403-410; Altschul et al., 1997, Nucleic Acids Res 25:17, 3389-402). For the PSI-BLAST search, the following exemplary parameters are employed: (1) the Expect threshold is 10; (2) the Gap cost is Existence:11 and Extension:1; (3) the Matrix employed is BLOSUM62; (4) the filter for low complexity regions is “on”.
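
As a rough illustration of the percent identity calculation described above, the following Python sketch counts identical positions between a reference and a test sequence that have already been aligned. It is not BLAST or PSI-BLAST and performs no alignment itself; the function name `percent_identity`, the use of `-` as the gap character, and the choice to divide by the number of aligned columns are assumptions made for this example only.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumptions (illustrative only): both strings come from the same
    alignment, gaps are written as '-', and identity is reported
    relative to the number of aligned columns.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    columns = 0
    for ref_res, test_res in zip(aligned_ref.upper(), aligned_test.upper()):
        # A column that is a gap in both sequences carries no information.
        if ref_res == "-" and test_res == "-":
            continue
        columns += 1
        # A gap paired with a residue counts as a column but never a match.
        if ref_res == test_res:
            matches += 1

    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical toy alignment: 5 identical columns out of 7.
    print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

A variant could then be checked against one of the recited thresholds (for example, at least 90%) by comparing the returned value with that threshold. Restricting the input strings to a functional domain gives identity over that domain rather than over the full coding region.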

Changes can be introduced by mutation into the nucleic acid sequence or amino acid sequence of any of the genes or gene products described herein, leading to changes in the amino acid sequence of the encoded protein or enzyme, without altering the functional ability of the protein or enzyme. For example, nucleotide substitutions leading to amino acid substitutions at “non-essential” amino acid residues can be made in the sequence of any of the sequences expressly disclosed herein. A “non-essential” amino acid residue is a residue at a position in the sequence that can be altered from the wild-type sequence of the polypeptide without altering the biological activity, whereas an “essential” amino acid residue is a residue at a position that is required for biological activity. For example, amino acid residues that are conserved among members of a family of proteins are not likely to be amenable to mutation. Other amino acid residues, however (e.g., those that are poorly conserved among members of the protein family), may not be as essential for activity and thus are more likely to be amenable to alteration. Thus, another aspect of the invention pertains to nucleic acid molecules encoding the proteins or enzymes disclosed herein that contain changes in amino acid residues relative to the amino acid sequences disclosed herein that are not essential for activity.

An isolated nucleic acid molecule encoding a protein homologous to any of the genes described herein can be created by introducing one or more nucleotide substitutions, additions or deletions into the corresponding nucleotide sequence, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.

Mutations can be introduced into a nucleic acid sequence such that the encoded amino acid sequence is altered by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. Certain amino acids have side chains with more than one classifiable characteristic. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, tryptophan, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tyrosine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted non-essential amino acid residue in a given polypeptide is replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a given coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for given polypeptide biological activity to identify mutants that retain activity. Conversely, the invention also provides for variants with mutations that enhance or increase the endogenous biological activity. Following mutagenesis of the nucleic acid sequence, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. An increase, decrease, or elimination of a given biological activity of the variants disclosed herein can be readily measured by a person of ordinary skill in the art, i.e., by measuring the capability for mediating oligosaccharide modification, synthesis, or degradation (via detection of the products).
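The side chain families listed above lend themselves to a simple lookup. The following Python sketch encodes those families with one-letter amino acid codes and tests whether a proposed substitution stays within at least one family. The names `SIDE_CHAIN_FAMILIES` and `is_conservative` are hypothetical, and the rule "shares at least one family" is an assumed reading of the text; the sketch says nothing about whether the position is non-essential.

```python
# Side chain families as enumerated in the text (one-letter codes).
# Some residues appear in more than one family, as the text notes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYWC"),
    "nonpolar": set("AVLIPFMYW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and share at least one family."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in members and replacement in members
        for members in SIDE_CHAIN_FAMILIES.values()
    )


# Lysine to arginine stays within the basic family.
lys_to_arg = is_conservative("K", "R")
# Lysine to aspartic acid crosses from basic to acidic.
lys_to_asp = is_conservative("K", "D")
```

Because some residues belong to several families, a substitution such as histidine to phenylalanine counts as conservative here through the aromatic family even though histidine is also listed as basic.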

The present invention also provides for functional fragments of the genes or gene products described herein. A fragment, in the case of these sequences and all others provided herein, is defined as a part of the whole that is less than the whole. Moreover, a fragment ranges in size from a single nucleotide or amino acid within a polynucleotide or polypeptide sequence to one fewer nucleotide or amino acid than the entire polynucleotide or polypeptide sequence. Finally, a fragment is defined as any portion of a complete polynucleotide or polypeptide sequence that is intermediate between the extremes defined above.

For example, fragments of any of the proteins or enzymes disclosed herein or encoded by any of the genes disclosed herein can be 10 to 20 amino acids, 10 to 30 amino acids, 10 to 40 amino acids, 10 to 50 amino acids, 10 to 60 amino acids, 10 to 70 amino acids, 10 to 80 amino acids, 10 to 90 amino acids, 10 to 100 amino acids, 50 to 100 amino acids, 75 to 125 amino acids, 100 to 150 amino acids, 150 to 200 amino acids, 200 to 250 amino acids, 250 to 300 amino acids, 300 to 350 amino acids, 350 to 400 amino acids, 400 to 450 amino acids, or 450 to 500 amino acids. The fragments encompassed in the present invention comprise fragments that retain functional activity. As such, the fragments preferably retain the catalytic domains that are required or are important for functional activity. Fragments can be determined or generated by using the sequence information herein, and the fragments can be tested for functional activity using standard methods known in the art. For example, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. The biological function of said fragment can be measured by measuring its ability to synthesize or modify a substrate oligosaccharide, or conversely, to catabolize an oligosaccharide substrate.
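The fragment definition above (any contiguous portion from a single residue up to one residue fewer than the whole) can be sketched in Python as follows. The function name `iter_fragments` and the example peptide are hypothetical; the sketch only enumerates candidate fragments by length and does not assess whether any of them retains functional activity.

```python
from typing import Iterator, Optional, Tuple


def iter_fragments(
    sequence: str, min_len: int = 1, max_len: Optional[int] = None
) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, fragment) for contiguous fragments of a sequence.

    Fragment lengths run from min_len up to max_len, but never reach the
    full sequence length, since a fragment is less than the whole.
    """
    n = len(sequence)
    upper = n - 1 if max_len is None else min(max_len, n - 1)
    for length in range(max(min_len, 1), upper + 1):
        for start in range(0, n - length + 1):
            yield start, sequence[start:start + length]


# Hypothetical 4-residue peptide: 4 + 3 + 2 = 9 fragments, none full length.
all_fragments = [frag for _, frag in iter_fragments("MKTA")]
```

A call such as `iter_fragments(protein, 10, 20)` would correspond to the "10 to 20 amino acids" range, and yields nothing when the protein is 10 residues or shorter.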

EXAMPLE 1 Engineering of E. coli to Generate Host Strains for the Production of N-acetylglucosamine-Containing Human Milk Oligosaccharides

The E. coli K12 prototroph, W3110, was chosen as the parent background for hMOS biosynthesis. This strain had previously been modified at the ampC locus by the introduction of a tryptophan-inducible P_(trpB)-cI+ repressor construct (McCoy, J. & Lavallie, E. Current protocols in molecular biology/edited by Frederick M. Ausubel et al., (2001)), enabling economical production of recombinant proteins from the phage λ P_(L) promoter (Sanger, F., Coulson, A. R., Hong, G. F., Hill, D. F. & Petersen, G. B. J Mol Biol 162, 729-773 (1982)) through induction with millimolar concentrations of tryptophan (Mieschendahl, M., Petri, T. & Hänggi, U. Nature Biotechnology 4, 802-808 (1986)). The strain GI724, an E. coli W3110 derivative containing the tryptophan-inducible P_(trpB)-cI+ repressor construct in ampC, was used as the basis for further E. coli strain manipulations.

Biosynthesis of hMOS requires the generation of an enhanced cellular pool of lactose. This enhancement was achieved in strain GI724 through several manipulations of the chromosome using λ Red recombineering (Court, D. L., Sawitzke, J. A. & Thomason, L. C. Annu Rev Genet 36, 361-388 (2002)) and generalized P1 phage transduction (Thomason, L. C., Costantino, N. & Court, D. L. Mol Biol Chapter 1, Unit 1.17 (2007)). The ability of the E. coli host strain to accumulate intracellular lactose was first engineered by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI). During construction of this deletion, the lacIq promoter was placed immediately upstream of the lactose permease gene, lacY. The modified strain thus maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type copy of the lacZ (β-galactosidase) gene responsible for lactose catabolism. An intracellular lactose pool is therefore created when the modified strain is cultured in the presence of exogenous lactose.

An additional modification useful for increasing the cytoplasmic pool of free lactose (and hence the final yield of hMOS) is the incorporation of a lacA mutation. LacA is a lactose acetyltransferase that is only active when high levels of lactose accumulate in the E. coli cytoplasm. High intracellular osmolarity (e.g., caused by a high intracellular lactose pool) can inhibit bacterial growth, and E. coli has evolved a mechanism for protecting itself from high intracellular osmolarity caused by lactose by “tagging” excess intracellular lactose with an acetyl group using LacA, and then actively expelling the acetyl-lactose from the cell (Danchin, A. Bioessays 31, 769-773 (2009)). Production of acetyl-lactose in E. coli engineered to produce human milk oligosaccharides is therefore undesirable: it reduces overall yield. Moreover, acetyl-lactose is a side product that complicates oligosaccharide purification schemes. The incorporation of a lacA mutation resolves these problems, as carrying a deletion of the lacA gene renders the bacterium incapable of synthesizing acetyl-lactose.

A thyA (thymidylate synthase) mutation was introduced by almost entirely deleting the thyA gene and replacing it with an inserted functional, wild-type, but promoter-less E. coli lacZ⁺ gene carrying the 2.8 ribosome binding site (SEQ ID NO: 10) (ΔthyA::(2.8RBS lacZ⁺, kan^(r))). λ Red recombineering was used to perform the construction. FIG. 4 illustrates the new configuration of genes thus engineered at the thyA locus. The complete DNA sequence of the region, with annotations in GenBank format, is disclosed herein. Genomic DNA sequence surrounding the lacZ+ insertion into the thyA region is set forth in SEQ ID NO: 1.

The thyA defect can be complemented in trans by supplying a wild-type thyA gene on a multicopy plasmid (Belfort, M., Maley, G. F. & Maley, F. Proceedings of the National Academy of Sciences 80, 1858 (1983)). This complementation is used herein as a means of plasmid maintenance (eliminating the need for a more conventional antibiotic selection scheme to maintain plasmid copy number).

The genotype of strain E680 is given below. E680 incorporates all the changes discussed above and is a host strain suitable for the production of N-acetylglucosamine-containing oligosaccharides.

F′402 proA+B+, PlacIq-lacY, Δ(lacI-lacZ) 158, ΔlacA398/araC, Δgpt-mhpC, ΔthyA::(2.8RBS lacZ+, KAN), rpoS+, rph+, ampC::(Ptrp T7g10 RBS-λcI+, CAT)

E796 is a strain similar to E680 and carries a thyA (thymidylate synthase) mutation, introduced by almost entirely deleting the thyA gene and replacing it with an inserted functional, wild-type, but promoter-less E. coli lacZ⁺ gene, in this case carrying the 0.8 ribosome binding site (SEQ ID NO: 11) [ΔthyA::(0.8RBS lacZ+, KAN)]. The genotype of strain E796 is given below. E796 incorporates all the changes discussed above and is a host strain suitable for the production of N-acetylglucosamine-containing oligosaccharides.

F′402 proA+B+, PlacIq-lacY, Δ(lacI-lacZ) 158, ΔlacA398/araC, Δgpt-mhpC, ΔthyA::(0.8RBS lacZ+, KAN), rpoS+, rph+, ampC::(Ptrp T7g10 RBS-λcI+, CAT)

E866 is a strain similar to E796 and is useful for dual plasmid selection. E866 also carries a thyA (thymidylate synthase) mutation, introduced by almost entirely deleting the thyA gene and replacing it with an inserted functional, wild-type, but promoter-less E. coli lacZ⁺ gene carrying the 0.8 ribosome binding site (SEQ ID NO: 11) [ΔthyA::(0.8RBS lacZ+)]. In addition to the thyA deletion, E866 also carries a deletion of the purA gene. The genotype of strain E866 is given below. E866 incorporates all the changes discussed above and is a host strain suitable for the production of N-acetylglucosamine-containing oligosaccharides.

F′402 proA+B+, PlacIq-lacY, Δ(lacI-lacZ) 158, ΔlacA398/araC, Δgpt-mhpC, ΔthyA::(0.8RBS lacZ+), rpoS+, rph+, ampC::(Ptrp T7g10 RBS-λcI+, CAT), ΔpurA727::KAN
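Because the host strains differ by only a few markers, it can help to compare genotype strings programmatically. The Python sketch below splits a genotype into markers at top-level commas (commas inside parentheses belong to a single marker) and reports the markers unique to each strain. The genotype strings are those listed above for E680 and E866; the function names `split_markers` and `genotype_difference` are hypothetical helpers, not part of the disclosure.

```python
from typing import List, Tuple


def split_markers(genotype: str) -> List[str]:
    """Split a genotype string at commas that are not inside parentheses."""
    markers, current, depth = [], [], 0
    for ch in genotype:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            markers.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        markers.append(tail)
    return markers


def genotype_difference(a: str, b: str) -> Tuple[List[str], List[str]]:
    """Return (markers only in a, markers only in b), each sorted."""
    set_a, set_b = set(split_markers(a)), set(split_markers(b))
    return sorted(set_a - set_b), sorted(set_b - set_a)


E680 = ("F′402 proA+B+, PlacIq-lacY, Δ(lacI-lacZ) 158, ΔlacA398/araC, "
        "Δgpt-mhpC, ΔthyA::(2.8RBS lacZ+, KAN), rpoS+, rph+, "
        "ampC::(Ptrp T7g10 RBS-λcI+, CAT)")
E866 = ("F′402 proA+B+, PlacIq-lacY, Δ(lacI-lacZ) 158, ΔlacA398/araC, "
        "Δgpt-mhpC, ΔthyA::(0.8RBS lacZ+), rpoS+, rph+, "
        "ampC::(Ptrp T7g10 RBS-λcI+, CAT), ΔpurA727::KAN")

only_e680, only_e866 = genotype_difference(E680, E866)
```

For these two strings the comparison isolates the thyA allele (2.8RBS with KAN versus 0.8RBS without) and the additional ΔpurA727::KAN marker in E866.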

EXAMPLE 2 Production of N-acetylglucosamine-containing Human Milk Oligosaccharides in E. coli: Lacto-N-tetraose (LNT) and Lacto-N-neotetraose (LNnT)

The first step in the synthesis (from a lactose precursor) of both Lacto-N-tetraose (LNT) and Lacto-N-neotetraose (LNnT) is the addition of a β(1,3)N-acetylglucosamine residue to lactose, utilizing a heterologous β(1,3)-N-acetylglucosaminyltransferase to form Lacto-N-triose 2 (LNT2). The plasmid pG292 (ColE1, thyA+, bla+, P_(L)-lgtA) (SEQ ID NO: 2, FIG. 5) carries the lgtA β(1,3)-N-acetylglucosaminyltransferase gene of N. meningitidis and can direct the production of LNT2 in E. coli strain E680 under appropriate culture conditions. pG221 (ColE1, thyA+, bla+, P_(L)-lgtA-wbgO) (SEQ ID NO: 3, FIG. 6) is a derivative of pG292 that carries (arranged as an operon) both the lgtA β(1,3)-N-acetylglucosaminyltransferase gene of N. meningitidis and the wbgO β(1,3)-galactosyltransferase gene of E. coli O55:H7. pG221 directs the production of LNT in E. coli strain E680 under appropriate culture conditions. pG222 (ColE1, thyA+, bla+, P_(L)-lgtA-4GalT) (SEQ ID NO: 4, FIG. 7) is a derivative of pG292 that carries (arranged as an operon) both the lgtA β(1,3)-N-acetylglucosaminyltransferase gene of N. meningitidis and the 4GalT (jhp0765) β(1,4)-galactosyltransferase gene of H. pylori. pG222 directs the production of LNnT in E. coli strain E680 under appropriate culture conditions.

The addition of tryptophan to the lactose-containing growth medium of cultures of any one of the E680-derivative strains transformed with plasmids pG292, pG221 or pG222 leads, for each particular E680/plasmid combination, to activation of the host E. coli tryptophan utilization repressor TrpR, subsequent repression of P_(trpB), and a consequent decrease in cytoplasmic cI levels, which results in a de-repression of P_(L), expression of lgtA, lgtA+wbgO, or lgtA+4GalT respectively, and production of LNT2, LNT, or LNnT respectively.
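The regulatory cascade just described is a chain of two repressions, which can be summarized as a purely qualitative Boolean sketch in Python. It ignores leaky expression, kinetics and dosage, and the function and dictionary names (`pl_promoter_active`, `expected_product`, `PLASMID_PRODUCTS`) are hypothetical labels introduced for illustration only.

```python
from typing import Optional

# Plasmid -> (genes expressed from P_L, oligosaccharide product), per the text.
PLASMID_PRODUCTS = {
    "pG292": ("lgtA", "LNT2"),
    "pG221": ("lgtA+wbgO", "LNT"),
    "pG222": ("lgtA+4GalT", "LNnT"),
}


def pl_promoter_active(tryptophan_added: bool) -> bool:
    """Qualitative on/off state of P_L as a function of tryptophan addition."""
    trpr_active = tryptophan_added   # tryptophan activates the TrpR repressor
    ptrp_active = not trpr_active    # active TrpR represses P_trpB
    ci_present = ptrp_active         # cI is expressed from P_trpB
    return not ci_present            # cI represses P_L


def expected_product(
    plasmid: str, tryptophan_added: bool, lactose_present: bool
) -> Optional[str]:
    """Expected product, or None if P_L is off or no lactose acceptor is fed."""
    if pl_promoter_active(tryptophan_added) and lactose_present:
        return PLASMID_PRODUCTS[plasmid][1]
    return None


product = expected_product("pG222", tryptophan_added=True, lactose_present=True)
```

The double negation is the point of the design: tryptophan switches one repressor on in order to switch a second repressor off, so adding tryptophan induces the glycosyltransferase genes.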

For LNT2, LNT, or LNnT production in small scale laboratory cultures (<100 ml), strains were grown at 30° C. in a selective medium lacking both thymidine and tryptophan (e.g., M9 salts, 0.5% glucose, 0.4% casaminoacids) to early exponential phase. Lactose was then added to a final concentration of 0.5 or 1%, along with tryptophan (200 μM final) to induce expression of the respective glycosyltransferases, driven from the P_(L) promoter. At the end of the induction period (~24 h), TLC analysis was performed on aliquots of cell-free culture medium. FIG. 8 illustrates schematically the enzymatic reactions necessary to produce from lactose, via the intermediate trisaccharide lacto-N-triose 2 (LNT2), the two human milk oligosaccharides: Lacto-N-tetraose (LNT) and Lacto-N-neotetraose (LNnT). A thin layer chromatogram (on left) is presented of culture medium samples taken from small scale E. coli cultures and demonstrating synthesis of LNT2, LNT, and LNnT (utilizing induced, lactose-containing cultures of E680 transformed with pG292, pG221 or pG222 respectively). A second thin layer chromatogram (on right) is presented of culture medium samples taken from an E. coli E680/pG222 15 L bioreactor culture and demonstrating synthesis of LNnT (as well as the higher molecular weight hMOS, Lacto-N-neohexaose, LNnH).
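As a worked illustration of the induction step, the Python sketch below computes how much of a concentrated stock must be added to a culture to reach a stated final concentration, accounting for the volume of the addition itself. The stock concentrations (20 mM tryptophan, 20% lactose), the 50 ml culture volume and the reading of "%" as weight per volume are all assumptions made for the example; the text specifies only the final concentrations. The function name `stock_volume_to_add` is hypothetical.

```python
def stock_volume_to_add(
    culture_ml: float, stock_conc: float, final_conc: float
) -> float:
    """Volume of stock (ml) giving final_conc after addition to culture_ml.

    Solves final_conc = stock_conc * v / (culture_ml + v) for v.
    Both concentrations must be in the same units.
    """
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock")
    return final_conc * culture_ml / (stock_conc - final_conc)


# Assumed 50 ml culture; tryptophan to 200 uM (0.2 mM) from a 20 mM stock.
tryptophan_ml = stock_volume_to_add(50.0, stock_conc=20.0, final_conc=0.2)

# Assumed 20% lactose stock; lactose to 1% final.
lactose_ml = stock_volume_to_add(50.0, stock_conc=20.0, final_conc=1.0)
```

Each call treats its addition in isolation; when lactose and tryptophan are added together, each slightly dilutes the other, an effect this sketch ignores.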

Although the above results clearly demonstrate how it is possible to synthesize GlcNAc-containing oligosaccharides (i.e., LNT2, LNT and LNnT) in engineered E. coli, FIG. 14 illustrates a serious problem faced when attempting to use the E. coli UDP-GlcNAc pool during such syntheses. In FIG. 14, four separate cultures of E680, transformed with pG292, were grown in the presence and absence of lactose, and with LgtA expression both induced and uninduced by tryptophan addition. It can clearly be seen that massive cell lysis occurs in the cultures where lactose is present, i.e., in those cultures where LgtA draws down the cellular UDP-GlcNAc pool by adding GlcNAc to lactose (and making LNT2). In so doing, UDP-GlcNAc is diverted from cell wall biosynthesis towards hMOS biosynthesis, and cell lysis results. This lysis can be monitored readily not only by the precipitous drop in culture density as seen in the figure, but also by the appearance of DNA in the culture medium.

EXAMPLE 3 Boosting the Cellular UDP-GlcNAc Pool Prevents Cell Lysis During the Biosynthesis of LNnT in Engineered E. coli

To examine the impact of enhancing the E. coli cellular UDP-GlcNAc pool during synthesis of N-acetylglucosamine-containing hMOS, the p15A replicon plasmid pG356 was constructed (FIG. 19 and SEQ ID NO: 12). pG356 carries a p15A replicon (compatible with ColE1 replicons), purA and ampC selectable markers, and a synthetic operon (under control of the pL promoter) carrying the E. coli glmS (encoding L-glutamine:D-fructose-6-phosphate aminotransferase) and nagC (encoding the bi-functional transcriptional activator/repressor of glm and nag operons) genes. When pL is active in strains carrying the plasmid pG356, the UDP-GlcNAc pool increases. Strain E796 (see Example 1) was transformed with pG222 (FIG. 7), and strain E866 (see Example 1) was transformed with both pG222 (FIG. 7) and pG356 (FIG. 19). (Strains E796 and E866 are isogenic save for the purA mutation found in E866 that is used for pG356 plasmid retention.) Identical 1.5 L fermentation runs were performed on each of the transformed strains. Optical density of the cultures and LNnT biosynthesis were followed, along with standard fermentation parameters. As can be seen in FIG. 18, the E796/pG222 culture produced LNnT, but lysed when the cell density reached 75 OD600, and achieved a final cell density at end-of-fermentation of only 50 OD600. In contrast (FIG. 19), with the E866/pG222+pG356 culture (where expression of the glmS and nagC genes enhances the intracellular UDP-GlcNAc pool), LNnT was also produced, but with no cell lysis observed. In this culture, end-of-fermentation cell density reached 108 OD600, more than twice the density achieved for E796/pG222.

EXAMPLE 4 Production of 6′-sialyllactose (6′-SL) by Engineered E. coli (ΔnanRATEK)

For the production of 6′-sialyllactose, Escherichia coli GI724 (ATCC55151) was engineered with a set of mutations that cause cytoplasmic accumulation of non-acetylated lactose precursor and prevent the degradation of N-acetyl-5-neuraminic acid (FIG. 3). In particular, the lacZ (β-galactosidase) and lacA (lactose acetyl transferase) genes from the lac operon were deleted, leaving the LacIq repressor and the LacY permease fully functional. The LacY permease can be driven by weak (e.g., lac8) or strong (e.g., Ptac) promoters. The entire nan operon (nanRATEK; structural and regulatory genes involved in neuraminic acid degradation) was deleted in this example. E. coli genome manipulations were achieved using a combination of standard molecular genetics techniques, specifically lambda-Red recombineering, allele exchanges with positive selection suicide vectors, and P1 transductions (FIG. 3). The host genotype of strain E781, suitable for production of sialylated hMOS, is presented below:

-   ampC::(Ptrp-λcI+), lacIq lacPL8, ΔnanRATEK471, ΔlacZ690, ΔlacA745

To produce 6′-sialyllactose, the cellular UDP-GlcNAc pool must be converted into the sugar-nucleotide activated precursor, CMP-NeuAc, which in turn can function as a donor molecule for a sugar acceptor (i.e., lactose) in a sialyltransferase-catalyzed reaction (FIG. 3). To this end, three genes from Campylobacter jejuni ATCC43438, encoding i) UDP-N-acetylglucosamine 2-epimerase (NeuC), ii) N-acetylneuraminate synthase (NeuB), and iii) N-acetylneuraminate cytidylyltransferase (NeuA), were constitutively co-expressed in the engineered E. coli strain described above, along with a gene encoding an α(2,6)sialyltransferase from Photobacterium spp JT-ISH-224 (SEQ ID NO: 21; GenBank protein Accession BAF92026, incorporated herein by reference). The neu genes were expressed from a low copy number plasmid vector (pG317, FIG. 9, SEQ ID NO: 5) carrying a constitutive lac promoter (pBBR1 ori, cat+, Plac), while the α(2,6)sialyltransferase gene was expressed from a high copy number plasmid vector (pG315, FIG. 10, SEQ ID NO: 6) carrying a constitutive lac promoter (ColE1 ori, bla+, Plac). To prevent the synthesis of side-products, the relative expression of the α(2,6)sialyltransferase gene compared to the neu genes is modulated by engineering differing ribosomal binding sites (RBS), providing various degrees of translational efficiency, upstream of the α(2,6)sialyltransferase gene. Engineered strains were grown to high density in pilot scale fermentors using a batch to fed-batch strategy. FIG. 11 is a TLC analysis of culture supernatants from two such fermentations, with samples to the left of the figure being taken from a fermentation of a strain containing pG315 (and thus carrying the RBS presented in SEQ ID NO: 7 in front of the α(2,6)sialyltransferase gene in the vector). Samples on the right of the figure are taken from a fermentation of a strain containing a close variant of pG315 (pG345, FIG. 12, SEQ ID NO: 9, carrying the weaker RBS presented in SEQ ID NO: 8 in front of the α(2,6)sialyltransferase gene and replacing the RBS presented in SEQ ID NO: 7). In both cases, the lactose precursor was added at a cell density of 50 OD₆₀₀ and efficient conversion to final products was achieved within 48 hours from the lactose addition. The final yield of 6′SL was increased when utilizing the plasmid with the weaker RBS upstream of the α(2,6)sialyltransferase gene, and moreover the level of KDO-lactose side product was very significantly decreased using this weaker RBS. The identity of the 6′-SL purified using activated carbon column chromatography was confirmed by ESI mass spectrometry and NMR.

EXAMPLE 5 Production of 6′-sialyllactose (6′-SL) by Engineered E. coli (ΔnanA, ΔnanATE)

For the production of 6′-sialyllactose, Escherichia coli GI724 (ATCC55151) was engineered with a set of mutations that cause cytoplasmic accumulation of non-acetylated lactose precursor and prevent the degradation of N-acetyl-5-neuraminic acid (FIG. 13). In particular, the lacZ (β-galactosidase) and lacA (lactose acetyl transferase) genes from the lac operon were deleted, leaving the LacIq repressor and the LacY permease fully functional. The LacY permease can be driven by weak (e.g., lac8) or strong (e.g., Ptac) promoters. While the entire nan operon (nanRATEK; structural and regulatory genes involved in neuraminic acid degradation) can be deleted to abolish neuraminic acid catabolism as in Example 4, lesser deletions encompassing just the nanA, or nanA, nanT and nanE, or nanA and nanE genes, are also suitable. In all the instances where the nanE gene was mutated, the last 104 bp of the nanE gene were left intact to allow for undisturbed transcription/translation of downstream nanK, although other lengths of residual nanE sequence are possible. E. coli genome manipulations were achieved using a combination of standard molecular genetics techniques, specifically lambda-Red recombineering, allele exchanges with positive selection suicide vectors, and P1 transductions (FIG. 13). The host genotypes of strains E971, E1017 and E1018, suitable for production of sialylated hMOS with various yield and purity, are presented below:

-   ampC::(Ptrp-λcI+), lacIq lacPL8, ΔnanA::kanR, ΔlacZ690, ΔlacA::scar,

-   ampC::(Ptrp-λcI+), lacIq lacPL8, ΔnanATE::kanR::nanK+, ΔlacZ690, ΔlacA::scar, and

-   ampC::(Ptrp-λcI+), lacIq lacPL8, ΔnanATE::scar::nanK+, ΔlacZ690, ΔlacA::scar, respectively.

To produce 6′-sialyllactose, the cellular UDP-GlcNAc pool must be converted into the sugar-nucleotide activated precursor, CMP-NeuAc, which in turn can function as a donor molecule for a sugar acceptor (i.e., lactose) in a sialyltransferase-catalyzed reaction (FIG. 13). To this end, three genes from Campylobacter jejuni ATCC43438, encoding i) UDP-N-acetylglucosamine 2-epimerase (NeuC), ii) N-acetylneuraminate synthase (NeuB), and iii) N-acetylneuraminate cytidylyltransferase (NeuA), were constitutively co-expressed in the engineered E. coli strain described above, along with a gene encoding an α(2,6)sialyltransferase from Photobacterium spp JT-ISH-224. The neu genes were expressed from a low copy number plasmid vector (pG317, FIG. 9, SEQ ID NO: 5) carrying a constitutive lac promoter (pBBR1 ori, cat+, Plac), while the α(2,6)sialyltransferase gene was expressed from the weak RBS of SEQ ID NO: 8 in a high copy number plasmid vector (pG345, FIG. 12, SEQ ID NO: 9) carrying a constitutive lac promoter (ColE1 ori, bla+, Plac). Engineered strains were grown to high density in pilot scale fermentors using a batch to fed-batch strategy. FIG. 14 is a TLC analysis of culture pellets or supernatants from three such fermentations. Panel A shows production and accumulation of 6′SL in the cells of three genetic backgrounds (only the relevant nan mutations are shown for strains E971, E1017 and E1018). Panels B and C show production and accumulation of 6′SL in the extracellular milieu (supernatants) in strains E971, E1017 and E1018 (only the relevant nan mutations are shown), with estimated maximum volumetric yields of 15 g per liter of supernatant. In all cases, the lactose precursor was added at a cell density of 40 OD₆₀₀ and steady state conversion to final products was achieved within approximately 90 hours from the lactose addition (EFT is elapsed fermentation time).

The various sequences presented herein are recited below.

SEQ ID NO: 1 >E680_thyA::2.8RBS_lacZ Escherichia coli str.GCAGCGGAACTCACAAGGCACCATAACGTCCCCTCCCTGATAACGCTGATACTGTGGTCGCGGTTATGCCAGTTGGCATCTTCACGTAAATAGAGCAAATAGTCCCGCGCCTGGCTGGCGGTTTGCCATAGCCGTTGCGACTGCTGCCAGTATTGCCAGCCATAGAGTCCACTTGCGCTTAGCATGACCAAAATCAGCATCGCGACCAGCGTTTCAATCAGCGTATAACCACGTTGTGTTTTCATGCCGGCAGTATGGAGCGAGGAGAAAAAAAGACGAGGGCCAGTTTCTATTTCTTCGGCGCATCTTCCGGACTATTTACGCCGTTGCAGGACGTTGCAAAATTTCGGGAAGGCGTCTCGAAGAATTTAACGGAGGGTAAAAAAACCGACGCACACTGGCGTCGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTattaaacctactATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTACACCAACGTGACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAATCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTGGGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGCTGGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATGATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCTTTCGGCGGTGAAATTATCGATGAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCCGAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGCCTGCGATGTCGGTTTCCGCGAGGTGCGGATTGAAAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATATCCTGCTGATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTGGTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCACGGCATGGTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGAACGCGTAACGCGAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCTGGGGAATGAATCAGGCCACGGCGCTAATCACGACGCGCTGTATCGCTGGATCAAATCTGTCGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGAGCCGACAC
CACGGCCACCGATATTATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGGCTGTGCCGAAATGGTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATACGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCAGTATCCCCGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATATGATGAAAACGGCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGATCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCGACCGCACGCCGCATCCAGCGCTGACGGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAGTGACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACTGGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACAGTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGTACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCGCATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTGtTGACGCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGCGACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGCCGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATTACGACCGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGATTGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCATCCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCTCGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTGGGATCTGCCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCAGTTCAACATCAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTGCACGCGGAAGAAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTCCTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTTGGTCTGGTGTCAAAAATAAGCGGCCGCtTTATGTAGGCTGGAGCTGCTTCGAAGTTCCTATACTTTCTAGAGAATAGGAACTTCGGAATAGGAACTTCAAGATCCCCTTATTAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCGAATCGGGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTCTTCAGCAATATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACACCCAGCCGGCCACAGTCGATGAATCCtGAAAAGCGGCCATTTTCCACCATGATATTCGGCAAGCAGGCATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCGCGCCTTGAGCCTGGCGAACAGTTCGGCTGGCGCGAGCCCCTGATGCTC
TTCGTCCAGATCATCCTGATCGACAAGACCGGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTCGCTTGGTGGTCGAATGGGCAGGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATCAGCCATGATGGATACTTTCTCGGCAGGAGCAAGGTGAGATGACAGGAGATCCTGCCCCGGCACTTCGCCCAATAGCAGCCAGTCCCTTCCCGCTTCAGTGACAACGTCGAGCACAGCTGCGCAAGGAACGCCCGTCGTGGCCAGCCACGATAGCCGCGCTGCCTCGTCCTGCAGTTCATTCAGGGCACCGGACAGGTCGGTCTTGACAAAAAGAACCGGGCGCCCCTGCGCTGACAGCCGGAACACGGCGGCATCAGAGCAGCCGATTGTCTGTTGTGCCCAGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGAACCTGCGTGCAATCCATCTTGTTCAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGATCTTGATCCCCTGCGCCATCAGATCCTTGGCGGCAAGAAAGCCATCCAGTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCTGGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCTATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCTTGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGGGGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGTTCCGCTTCCTTTAGCAGCCCTTGCGCCCTGAGTGCTTGCGGCAGCGTGAGCTTCAAAAGCGCTCTGAAGTTCCTATACTTTCTAGAGAATAGGAACTTCGAACTGCAGGTCGACGGATCCCCGGAATCATGGTTCCTCAGGAAACGTGTTGCTGTGGGCTGCGACGATATGCCCAGACCATCATGATCACACCCGCGACAATCATCGGGATGGAAAGAATTTGCCCCATGCTGATGTACTGCACCCAGGCACCGGTAAACTGCGCGTCGGGCTGGCGGAAAAACTCAACAATGATGCGAAACGCGCCGTAACCAATCAGGAACAAACCTGAGACAGCTCCCATTGGGCGTGGTTTACGAATATACAGGTTGAGGATAATAAACAGCACCACACCTTCCAGCAGCAGCTCGTAAAGCTGTGATGGGTGGCGCGGCAGCACACCGTAAGTGTCGAAAATGGATTGCCACTGCGGGTTGGTTTGCAGCAGCAAAATATCTTCTGTACGGGAGCCAGGGAACAGCATGGCAAACGGGAAGTTCGGGTCAACGCGGCCCCACAATTCACCGTTAATAAAGTTGCCCAGACGCCCGGCACCAAGACCAAACGGAATGAGTGGTGCGATAAAATCAGAGACCTGGAAGAAGGAACGTTTAGTACGGCGGGCGAAGATAATCATCACCACGATAACGCCAATCAGGCCGCCGTGGAAAGACATGCCGCCGTCCCAGACACGGAACAGATACAGCGGATCGGCCATAAACTGCGGGAAATTGTAGAACAGAACATAACCAATACGTCCCCCGAGGAAGACGCCGAGGAAGCCCGCATAGAGTAAGTTTTCAACTTCATTTTTGGTCCAGCCGCTGCCCGGACGATTCGCCCGTCGTGTTGCCAGCCACATTGCAAAAATGAAACCCACCAGATACATCAGGCCGTACCAGTGAAGCGCCACGGGTCCTATTGAGAAAATGACCGGATCAAACTCCGGAAAATGCAGATAGCTACTGGTCATCTGTCACCACAAGTTCTTGTTATTTCGCTGAAAGAGAACAGCGATTGAAATGCGCGCCGCAGGTTTCAGGCGCTCCAAAGGTGCGAATAATAGCACAAGGGGACCTGGCTGGTTGCCGGATACCGTTAAAAGATATGTATASEQ ID NO: 2 >pG292, complete 
sequence.TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAggcgccTCCTCAACCTGTATATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGCAAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTACACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCGAACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCTAATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAATTCTTCGAGACGCCTTCCCGAAggcgccATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTACTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGGTCGACTCTAGATGCATGCTCGAGTCAACGGTTTTTCAGCAATCGGTGCAAAATGCCGAAGTATTGCCTCAAGGTAAACAGCCGCCGCATCCTGCCGTCTGCCGCAAAATCCAGCCACGCGCCGGCGGGCAGCGTGTCCGTCCGTTTGAAGCATTGGTACAAAAACCGGCGGGCGCGTTCAAAATCTTCTTCCGGCAAATGTTTCTCCAGCAATTCATACGCTACTGCTTTTATTTGGCGGTATTCAAGGCTGTCGAACCGGGTTTTAAAACCCATAGACTGCAAAAAATCGTTTCTGGCGGTTTTTTGGATGCCTTGCGCGATTTCGTGTTGGCGGATGCTGTATTTGGATGAAACCTGATTGGCGTGAAGGCGGTATTTGA
CCAAGGCTTCGGGATAATAAGCCAGCCTGCCCAATTTGCTGACATCGTACCAAAATTGGTAATCTTCCGCCCAATCCCGCTCGGTGTTGTAACGCAAACCGCCGTCAATGACGCTGCGCCTCATAATCATCGTGTTGTTGTGTATGGGGTTGCCGAAAGGGAAAAAGTCGGCAATGTCTTCGTGTCGGGTCGGTTTTTTCCAAATTTTGCCGTGTTCGTGGTGCCGCGCCAGCCGGTTGCCGTCCTTTTCTTCCGACAAAACTTCCAGCCACGCACCCATCGCGATGATGCTGCGGTCTTTTTCCATCTCACCCACGATTTTCTCAATCCAGTCGGGGGCGGCAATATCGTCTGCATCGGTGCGCGCAATATATTCCCCCCCCCCCCCCGACTTTGCCAATTCATCCAGCCCGATGTTTAAAGAGGGAATCAGACCGGAATTGCGCGGCTGCGCGAGGATGCGGATGCGGCCGTCCTGTTCTTGGAAACGCTGGGCAATGGCAAGCGTACCGTCCGTCGAGCCGTCATCGACAATCAAAATATCCAAGTTGCGCCAAGTTTGATTCACGACGGCGGCTAATGATTGGGCGAAATATTTTTCTACGTTGTAGGCGCAAATCAATACGCTGACTAAAGGCTGCAATTTATTCTCCCGATAGGCACGATGCCGTCTGAAGGCTTCAGACGGCATATGtatatctccttcttgaaTTCTAACAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGA
GTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGC CCTTTCGTCSEQ ID NO: 3 >pG221, complete 
sequence.TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAggcgccTCCTCAACCTGTATATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGCAAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTACACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCGAACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCTAATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAATTCTTCGAGACGCCTTCCCGAAggcgccATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTACTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGGTCGACTCTAGATGCATGCTCGAGTTATTATTTAATATATTTACAATAGATGAAGGACGCAATCGTACGGATACCGCCGAACAGGTAGTTAATGTTACCGGTCAGGAAGAAGCACTTCATTTTGATAACCAGGTCGTTAACCATCACCATGTACAGGTTTTTTTTTGCGGTAGACTGACCTTCGTGCAGGCGGTAGTAGAACAGGTATTCCGGCAGGTTTTGGAACTTGATTTTTGCCAGGCTCAGACGGTTCCACAGCTCGTAATCTTCGGAGTAGTTAGAAAACATATAACCACCGATGCTCGCGATGACTTTTTTACGAAACATTACGCTCGGGTGAACAATACAACACTTATACGGCAGGTTTTTAACGATGTCCAGGT
TCTCTTCCGGCAGTTTGGTCTTGTTGATTTCACGACCTTTGTCGTCAATAAAGATTGCGTTGGTACCCACAACATCTACGTACGGATTGTTCTTCAGGAAGTCAACCTGTTTAGTAAAACGGTCCGGGTGAGAGATGTCGTCAGAGTCCATACGGGCAATAAATTCGCCGTTGCTCAGGTCGATCGCTTTGTTCAGGGAGTACGGCAGGTAAGCGATGTTAGTGCGGATCAGTTTGATTTTGTCGTTAACTTTGTGTTTCAGTTCGTTATAGAAGTCGTCAGTGCAGCAGTTCGCAACGATGATGATTTCGAAGCTGCTGAAGGTCTGAGACAGGATGCTGTTGATCGCTTCGTCCAGAAAAGGGTTTTTCTTGTTAACAGGCAGGATAACGCTCACAACCGGGTGGGTAGATTCCGCGGATTCCGCTTCATCGATGATCATATGTATATCTCCTTCTTCTCGAGTCAACGGTTTTTCAGCAATCGGTGCAAAATGCCGAAGTATTGCCTCAAGGTAAACAGCCGCCGCATCCTGCCGTCTGCCGCAAAATCCAGCCACGCGCCGGCGGGCAGCGTGTCCGTCCGTTTGAAGCATTGGTACAAAAACCGGCGGGCGCGTTCAAAATCTTCTTCCGGCAAATGTTTCTCCAGCAATTCATACGCTACTGCTTTTATTTGGCGGTATTCAAGGCTGTCGAACCGGGTTTTAAAACCCATAGACTGCAAAAAATCGTTTCTGGCGGTTTTTTGGATGCCTTGCGCGATTTCGTGTTGGCGGATGCTGTATTTGGATGAAACCTGATTGGCGTGAAGGCGGTATTTGACCAAGGCTTCGGGATAATAAGCCAGCCTGCCCAATTTGCTGACATCGTACCAAAATTGGTAATCTTCCGCCCAATCCCGCTCGGTGTTGTAACGCAAACCGCCGTCAATGACGCTGCGCCTCATAATCATCGTGTTGTTGTGTATGGGGTTGCCGAAAGGGAAAAAGTCGGCAATGTCTTCGTGTCGGGTCGGTTTTTTCCAAATTTTGCCGTGTTCGTGGTGCCGCGCCAGCCGGTTGCCGTCCTTTTCTTCCGACAAAACTTCCAGCCACGCACCCATCGCGATGATGCTGCGGTCTTTTTCCATCTCACCCACGATTTTCTCAATCCAGTCGGGGGCGGCAATATCGTCTGCATCGGTGCGCGCAATATATTCCCCCCCCCCCCCCGACTTTGCCAATTCATCCAGCCCGATGTTTAAAGAGGGAATCAGACCGGAATTGCGCGGCTGCGCGAGGATGCGGATGCGGCCGTCCTGTTCTTGGAAACGCTGGGCAATGGCAAGCGTACCGTCCGTCGAGCCGTCATCGACAATCAAAATATCCAAGTTGCGCCAAGTTTGATTCACGACGGCGGCTAATGATTGGGCGAAATATTTTTCTACGTTGTAGGCGCAAATCAATACGCTGACTAAAGGCTGCAATTTATTCTCCCGATAGGCACGATGCCGTCTGAAGGCTTCAGACGGCATATGtatatctccttcttgaaTTCTAACAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGC
TTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGC
AAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCSEQ ID NO: 4 >pG222, complete sequence.TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAggcgccTCCTCAACCTGTATATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGCAAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTACACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCGAACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCTAATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAATTCTTCGAGACGCCTTCCCGAAggcgccATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTACTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGGTCGACTCTAGATGCATGctcgagTTATACAAACTGCCAATATTTCAAATATTTAAAATGGAGTTCTCTCATTAAGGCGATTTTAGGGCTATAAGGTTCTTCTTTTCGTGC
TATCGTAGAGATTTGCTCATCATCAGCGATCACAAAAGGTTGTAACACCAGATTTTTCACGCCATGGATAAAAGTAGCGTCCATTATCGTATCCACAGGAACAACCCATTTTCGGCTGCATTTCAAAAAAACTTTGGCAATCTTAGGCGTGATCACATAGCCTTGAGTCCCCACCCCTTCGCTATAAGCTTTAATGATCCCCACACGCTCTTGTATCTCGTGGTTTTTATGGCTCAATGGCTCACTTTTTACACTGGCATCATACAATAAATGCATCAAGCGGATATAGCCTAACTCTTGGATGTGTTTTTCTAAAAAATCCAAGCCCTCTTTAAAATCCTCTTTCAAGGTTATATCGTCTTCTAAAATACAGATCGCTTCATTGAGTTCTATGCATTTTTCCCACAAGGAATAATGACTCGCATAGCACCCAAGCTCCCCCAAGCTCATAAACTTCGCATGGTATTTTAAAGCGTAATAAAACTTAGAAACCTCACTGATGAGATTGGTTGTAATCCCCATGTCTTTGATGTTTTGCGTGATGAAATAAGGGTGTAAATGCTTTTTCACTAAGGGGTGCAACCCGCCTTCAAAAGTTTTAGAATAAATCGCATCAAAAATTTGCGCTTGGTGGTGGGTGGCATTGATGCTATTGAGTAAAGTTGTGGTGTCTCTAAAAACTAAACCAAATGTATCGCACACTTTTTGATTTAAAGAAATGGCAAAAACACGCAtATGtatatctccttcttCTCGAGTCAACGGTTTTTCAGCAATCGGTGCAAAATGCCGAAGTATTGCCTCAAGGTAAACAGCCGCCGCATCCTGCCGTCTGCCGCAAAATCCAGCCACGCGCCGGCGGGCAGCGTGTCCGTCCGTTTGAAGCATTGGTACAAAAACCGGCGGGCGCGTTCAAAATCTTCTTCCGGCAAATGTTTCTCCAGCAATTCATACGCTACTGCTTTTATTTGGCGGTATTCAAGGCTGTCGAACCGGGTTTTAAAACCCATAGACTGCAAAAAATCGTTTCTGGCGGTTTTTTGGATGCCTTGCGCGATTTCGTGTTGGCGGATGCTGTATTTGGATGAAACCTGATTGGCGTGAAGGCGGTATTTGACCAAGGCTTCGGGATAATAAGCCAGCCTGCCCAATTTGCTGACATCGTACCAAAATTGGTAATCTTCCGCCCAATCCCGCTCGGTGTTGTAACGCAAACCGCCGTCAATGACGCTGCGCCTCATAATCATCGTGTTGTTGTGTATGGGGTTGCCGAAAGGGAAAAAGTCGGCAATGTCTTCGTGTCGGGTCGGTTTTTTCCAAATTTTGCCGTGTTCGTGGTGCCGCGCCAGCCGGTTGCCGTCCTTTTCTTCCGACAAAACTTCCAGCCACGCACCCATCGCGATGATGCTGCGGTCTTTTTCCATCTCACCCACGATTTTCTCAATCCAGTCGGGGGCGGCAATATCGTCTGCATCGGTGCGCGCAATATATTCCCCCCCCCCCCCCGACTTTGCCAATTCATCCAGCCCGATGTTTAAAGAGGGAATCAGACCGGAATTGCGCGGCTGCGCGAGGATGCGGATGCGGCCGTCCTGTTCTTGGAAACGCTGGGCAATGGCAAGCGTACCGTCCGTCGAGCCGTCATCGACAATCAAAATATCCAAGTTGCGCCAAGTTTGATTCACGACGGCGGCTAATGATTGGGCGAAATATTTTTCTACGTTGTAGGCGCAAATCAATACGCTGACTAAAGGCTGCAATTTATTCTCCCGATAGGCACGATGCCGTCTGAAGGCTTCAGACGGCATATGtatatctccttcttgaaTTCTAACAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTT
ATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCC
GTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTC SEQ ID NO: 5 >pG317, complete sequence.GTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCATGCATAAAAACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCATCACAAACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAGACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAAACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCCATACGGAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGGATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGGTCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGATAACTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAGTTGGAACCTCTTACGTGCCGATCAACGTCTCATTTTCGCCAAAAGTTGGCCCAGGGCTTCCCGGTATCAACAGGGACACCAGGATTTATTTATTCTGCGAAGTGATCTTCCGTCACAGGTATTTATTCGAAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGCCCGCGTTCCTGCTGGCGCTGGGCCTGTTTCTGGCGCTGGACTTCCCGCTGTTCCGTCAGCAGCTTTTCGCCCACGGCCTTGATGATCGCGGCGGCCTTGGCCTGCATATCCCGATTCAACGGCCCCAGGGCGTCCAGAACGGGCTTCAGGCGCTCCCGAAGGTCTCGGGCCGTCTCTTGGGCTTGATCGGCCTTCTTG
CGCATCTCACGCGCTCCTGCGGCGGCCTGTAGGGCAGGCTCATACCCCTGCCGAACCGCTTTTGTCAGCCGGTCGGCCACGGCTTCCGGCGTCTCAACGCGCTTTGAGATTCCCAGCTTTTCGGCCAATCCCTGCGGTGCATAGGCGCGTGGCTCGACCGCTTGCGGGCTGATGGTGACGTGGCCCACTGGTGGCCGCTCCAGGGCCTCGTAGAACGCCTGAATGCGCGTGTGACGTGCCTTGCTGCCCTCGATGCCCCGTTGCAGCCCTAGATCGGCCACAGCGGCCGCAAACGTGGTCTGGTCGCGGGTCATCTGCGCTTTGTTGCCGATGAACTCCTTGGCCGACAGCCTGCCGTCCTGCGTCAGCGGCACCACGAACGCGGTCATGTGCGGGCTGGTTTCGTCACGGTGGATGCTGGCCGTCACGATGCGATCCGCCCCGTACTTGTCCGCCAGCCACTTGTGCGCCTTCTCGAAGAACGCCGCCTGCTGTTCTTGGCTGGCCGACTTCCACCATTCCGGGCTGGCCGTCATGACGTACTCGACCGCCAACACAGCGTCCTTGCGCCGCTTCTCTGGCAGCAACTCGCGCAGTCGGCCCATCGCTTCATCGGTGCTGCTGGCCGCCCAGTGCTCGTTCTCTGGCGTCCTGCTGGCGTCAGCGTTGGGCGTCTCGCGCTCGCGGTAGGCGTGCTTGAGACTGGCCGCCACGTTGCCCATTTTCGCCAGCTTCTTGCATCGCATGATCGCGTATGCCGCCATGCCTGCCCCTCCCTTTTGGTGTCCAACCGGCTCGACGGGGGCAGCGCAAGGCGGTGCCTCCGGCGGGCCACTCAATGCTTGAGTATACTCACTAGACTTTGCTTCGCAAAGTCGTGACCGCCTACGGCGGCTGCGGCGCCCTACGGGCTTGCTCTCCGGGCTTCGCCCTGCGCGGTCGCTGCGCTCCCTTGCCAGCCCGTGGATATGTGGACGATGGCCGCGAGCGGCCACCGGCTGGCTCGCTTCGCTCGGCCCGTGGACAACCCTGCTGGACAAGCTGATGGACAGGCTGCGCCTGCCCACGAGCTTGACCACAGGGATTGCCCACCGGCTACCCAGCCTTCGACCACATACCCACCGGCTCCAACTGCGCGGCCTGCGGCCTTGCCCCATCAATTTTTTTAATTTTCTCTGGGGAAAAGCCTCCGGCCTGCGGCCTGCGCGCTTCGCTTGCCGGTTGGACACCAAGTGGAAGGCGGGTCAAGGCTCGCGCAGCGACCGCGCAGCGGCTTGGCCTTGACGCGCCTGGAACGACCCAAGCCTATGCGAGTGGGGGCAGTCGAAGGCGAAGCCCGCCCGCCTGCCCCCCGAGCCTCACGGCGGCGAGTGCGGGGGTTCCAAGGGGGCAGCGCCACCTTGGGCAAGGCCGAAGGCCGCGCAGTCGATCAACAAGCCCCGGAGGGGCCACTTTTTGCCGGAGGGGGAGCCGCGCCGAAGGCGTGGGGGAACCCCGCAGGGGTGCCCTTCTTTGGGCACCAAAGAACTAGATATAGGGCGAAATGCGAAAGACTTAAAAATCAACAACTTAAAAAAGGGGGGTACGCAACAGCTCATTGCGGCACCCCCCGCAATAGCTCATTGCGTAGGTTAAAGAAAATCTGTAATTGACTGCCACTTTTACGCAACGCATAATTGTTGTCGCGCTGCCGAAAAGTTGCAGCTGATTGCGCATGGTGCCGCAACCGTGCGGCACCCTACCGCATGGAGATAAGCATGGCCACGCAGTCCAGAGAAATCGGCATTCAAGCCAAGAACAAGCCCGGTCACTGGGTGCAAACGGAACGCAAAGCGCATGAGGCGTGGGCCGGGCTTATTGCGAGGAAACCCACGGCGGCAATGCTGCTGCATCACCTCGTGGCGCAGATGGGCCACCAGAACGCCGTGGTGGTCAGCCAGAAGACACTTTCCAAGCTCATCGGACGTTCTTTGCGGACGGTCCAATACGCAGTCAAGGAC
TTGGTGGCCGAGCGCTGGATCTCCGTCGTGAAGCTCAACGGCCCCGGCACCGTGTCGGCCTACGTGGTCAATGACCGCGTGGCGTGGGGCCAGCCCCGCGACCAGTTGCGCCTGTCGGTGTTCAGTGCCGCCGTGGTGGTTGATCACGACGACCAGGACGAATCGCTGTTGGGGCATGGCGACCTGCGCCGCATCCCGACCCTGTATCCGGGCGAGCAGCAACTACCGACCGGCCCCGGCGAGGAGCCGCCCAGCCAGCCCGGCATTCCGGGCATGGAACCAGACCTGCCAGCCTTGACCGAAACGGAGGAATGGGAACGGCGCGGGCAGCAGCGCCTGCCGATGCCCGATGAGCCGTGTTTTCTGGACGATGGCGAGCCGTTGGAGCCGCCGACACGGGTCACGCTGCCGCGCCGGTAGCACTTGGGTTGCGCAGCAACCCGTAAGTGCGCTGTTCCAGACTATCGGCTGTAGCCGCCTCGCCGCCCTATACCTTGTCTGCCTCCCCGCGTTGCGTCGCGGTGCATGGAGCCGGGCCACCTCGACCTGAATGGAAGCCGGCGGCACCTCGCTAACGGATTCACCGTTTTTATCAGGCTCTGGGAGGCAGAATAAATGATCATATCGTCAATTATTACCTCCACGGGGAGAGCCTGAGCAAACTGGCCTCAGGCATTTGAGAAGCACACGGTCACACTGCTTCCGGTAGTCAATAAACCGGTAAACCAGCAATAGACATAAGCGGCTATTTAACGACCCTGCCCTGAACCGACGACCGGGTCGAATTTGCTTTCGAATTTCTGCCATTCATCCGCTTATTATCACTTATTCAGGCGTAGCACCAGGCGTTTAAGGGCACCAATAACTGCCTTAAAAAAATTACGCCCCGCCCTGCCACTCATCGCAGTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAGCGCGCGTAATACGACTCACTATAGGGCGAATTGGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCCCGGGCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAGTTAAGTCTCTAATCGATTGTTTTCCAATGGAATGGTTATAAAATCTTTGGTTTTTAGTCTTGAAAATCTTCTAGGATTTTCTATGTAAGTTTTTGTATAAATATTATATTGCTTTAATAAATTTAATATATTTTTATTGCATTTTAAGGTTATTTTTTCCATATCTGTTCAACCTTTTTTAAATCCTCCAAACAGTCAATATCTAAACTTGAGCTTTCGTCCATTAAAAAATGCTTGGTTTTGCTTTGTAAAAAGCTAGGATTGTTTAAAAATTCTTTTATCTTTAAAATATAAATTGCACCATTGCTCATATAAGTTTTAGGCAATTTTTGCCTTGGCATAAAAGGATATTCATCATTACAAATCCCTGCTAAATCGCCACAATCATTACAAACAAAGGCTTTTAGAATTTTATTATCACATTCGCTTACGCTAATTAGGGCATTTGCATTGCTATTTTTATAAAGATTAAAAGCTTCATTAATATGAATATTTGTTCTTAGCGGTGAAGTGGGTTGTAAAAAAACTACATCTTCATAATCTTTATAAAATTTTAGAGCATGTAACAGCACTTTATCGCTTGTGGTATCATCTTGTGCAAGGCTAATTGGGCGTTTTAAAATATCAACATTTTGACTTTTTGCATAATTTAAAATTTCATCACTATCACTGCTTACAACAACTTTACTAATGCTTTTAGCATTTAGTGCAGCTTTGATCGTGTAGTA
AATTAAAGGTTTATTGTTTAATAAAACCAAATTTTTATTTTTAATACCCTTTGAGCCACCACGAGCAGGGATTATTGCTAAGCTCATTTTATATCCTTAAAAACTTTTTGTGTGCTGAGTTTAAAAAAATCTCCGCTTTGTAAATATTCAAAAAATAATTTTGAGCTATCTAAAATCTCTAACTTAGCGCTAAATAAATCTTGTTTTTTATGAATAGTGTTAATAGCTTTTAGTATTTCATCACTATTTGCATTAACTTTTAGTGTATTTTCATTGCCAAGTCTTCCATTTTGTCTTGAGCCAACTAAAATCCCTGCTGTTTTTAAGTATAAGGCCTCTTTTAAAATACAACTTGAATTACCTATTATAAAATCAGCATTTTTTAACAAAGTTATAAAATACTCAAATCTAAGCGATGGAAAAAGCTTAAATCTAGGGTTATTTTTAAACTCTTCATAGCTTTGCAAGATTAATTCAAAACCTAAATCATTATTTGGATAAATAACAATATAATTTTTATTACTTTGTATCAGTGCTTTTACTAAATTGTCTGCTTGATTTTTAATGCTAGTAATTTCAGTTGTAACAGGATGAAACATAAGCAAAGCGTAGTTTTCATAATTTATATCATAATATTTTTTTGCTTCGCTAAGTGAAATTTTATTATCGTTTAAAAGTTCTAAATCAGGCGAACCTATGATAAAAATAGATTTTTCATCTTCTCCAAGCTGCATTAAACGCCTTTTTGCAAACTCATCATTTACTAAATGAATATGAGCTAGTTTTGATATAGCGTGGCGTAAGCTATCGTCAATAGTTCCTGAAATCTCTCCGCCTTCAATATGCGCTACTAAGATATTATTTAATGCTCCAACAATAGCTGCTGCTAAAGGCTCAATTCTATCTCCATGTACTACGATTAAATCAGGTTTTAGCTCATTTGCATACCTTGAAAATCCATCAATTGTAGTAGCTAAAGCCTTATCAGTTTGATAATATTTATCATAATTTATAAATTCATAAATATTTTTAAAGCCATTTTTATAAAGTTCTTTAACTGTATAGCCAAAATTTTTACTTAAGTGCATTCCTGTTGCAAAGATGTAAAGTTCAAATTCGCTTGAGTTTTGCACCCTGTACATTAAAGATTTAATCTTAGAATAATCAGCCCTAGAGCCTGTTATAAAAAGGATTTTTTTCACGCAAAATCCTCATAGCTTAACTGAGCATCATTTTCTATATCTCTTAATGCTTTTTTGCCTAAAATATTTTCAAATTCAGCCGCACTAATTCCACCAAGTCCAGGTCTTTTAACCCAAATATTATCCATAGATAAAACTTCGCCTTTTTTAATATCTTTAATGCTAACTACACTTGCAAAGGCAAAATCAATTGTAACTTGTTCTTGTTTAGCCGCTTTTTTACTTTCATTATTTCCTCTTATTATAGCCATTTGCTCACTTTGTATAATTAGCTCTTTTAAAGCCTTTGTATCCATAGAACAAACTATATCAGGGCCACTTCTATGCATACTATCAGTAAAATGTCTTTCAAGCACACAAGCTCCAAGTACAACTGCACCTAAACACGCAAGATTATCTGTTGTGTGGTCGCTTAAGCCTACCATACAAGAAAATTCTTTTTTTAACTCAAGCATAGCGTTTAATCTTACAAGATTATGCGGGGTTGGGTAAAGATTGGTCGTGTGCATTAAAACAAAAGGAATTTCATTGTCTAATAAGATTTTTACAGTTGGTTTTATACTTTCAATACTATTCATTCCTGTGCTAACTATCATAGGCTTTTTAAAGGCTGCTATGTGTTTAATAAGCGGATAATTATTACACTCACCTGAACCAATCTTAAAAGCACTAACTCCCATATCTTCTAAGCGGTTCGCACCTGCACGAGAAAAAGGTGTGCTAAGATAAACAAGACCTAATTTTTCTGTGTATTCTTTAAGTGCTAGCTCATCTTTATAATCCAAAGCACATTTTTGC
ATAATCTCATAAATGCTTATTTTTGCATTACCAGGAATTACTTTTTTAGCGGCCTTACTCATCTCATCTTCAACAATATGAGTTTGATGCTTTATAATCTTAGCACCTGCGCTAAAGGCTGCATCTACCATAATTTTAGCTAGTTCTAAACTGCCATTATGATTAATGCCTATTTCAGGTACGACTAAGGGTGCTTTTTCTTCACTTATGATTATATTTTGTATTTTTATTTCTTTCATTTATTTTCCTCCTTAGSEQ ID NO: 6 >pG315, complete sequenceCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAGCGCGCGTAATACGACTCACTATAGGGCGAATTGGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCTAGACTGCAATACAAACACCTGTTTCACAATTTGGCAGATCAGCCCAAAAAAGTACATTCTCTTCTTTTACAATACCTAGTTTTATCATTACTTGAACTAAAGGACTTCTCAAAGCAGTTTCACGATCAGTTATAGTTTCTGTCGATGTAAAAACTATAAATTTAATTTTTTCAGCTGGTATCGTGAAATATAAAGAGCTCGCTATACCAGCAACTGCATCAGGAAGCATATCTGTCATCATCAAAACTTCAAATGATATTTTTGATGGAATATCAACCATTGAAGGATAGTTTTGCATTATTAATGTATTAATGATACCGCCACCAGGGTGACCTTTGAAGAACAAATCATAACTATTGCCTAAATAATGTGGGCTCGATTCATTAATTGCATTATTAATGACATTAATTTGTTGTTTCGCATAATACTCTCTTTCATGGTTACCAGCCCATACAGTCGTACCTGTAAACACAAAGTTTGGTAAATTAGATGAATTATATTCATTTTGTAATTTTTGTTTGTCAAAATTAACAATCGATAAGAATAATTCTTGTTGTTTGCTATTGAATTTTTTGAAACCATCCCATTGCATTTGCTTTAAACTATCACCAATATAGTCTCGTAACTCATGTAATGATGGTTCTAAAGTTAAATAATCTTTTCTTAAAAAATGGTAGTTAGCTGGATATAGTTTTTGCCAGTTATAAACAGATGATGTTCCTGTATTTGAAGTGTCTTCATTGATACCATTAATGACATCCTCAAGATAATCTTTACCAATTTTTAAATTATCTGTTTTATTTAATGTATCTCTCCAGTTATATAAATTTACATATTCTGCTGAACCATCATCATATAAATCTATATTTGTTACCGTAACGTTATTAAACGAATTTAATTCTTTTAGTATTGGCACTAAATTATCAAATGAATGAGCAGTGTTAGAGCTAAGTTTAACATTCAATCTATGCTTTGTTTGTGCTTGCTTAACAATTTCTTGTACTAAGTCAG
CTGGTGTATGGTTATTTATCAATGCAAACGATGTAATATTTAACTCTTTCATTTGCTCATCAGTCGGAACTATTCTCCCCCAAGCTATATATCTTTGTGCTGTAGGATTTTCTTCTTCCGATTTAATAATATCCATTAGCTGCTGAAGAGTTGGAAGAGATGCATGATCAACATAAACCTCTAAAGATGGAGCCACTACGTTTAATGTTACTTTTGTTATATATTTTTCACCTTTATTACTAACACCATTAAAATCAAAGCAGTACTTTTCATCGTCATCTAATCGTGGCGCCACTACAGATAATGATATTGACTCTTTATTTTGTTCTGTTAATAGTTGTTGCGTACCACAAGTTTGTACCCAAGAGTGTTTTGTAAAAGAGATGTTTGATTGATTAATTGGCTCTAAATTAACATACTCCTCATCAATAATAGTTTTATTAATATCATTTTTAATAATAGATTGTGTATTTTCTTCTGACATggtctgtttcctcCTCGAGGGGGGGCCCGGTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGT
AAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAC SEQ ID NO: 7CTCGAGgaggaaacagaccATG SEQ ID NO: 8 CTCGAGgaaagaggggacaaactagATGSEQ ID NO: 9 >pG345, complete sequenceCTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAGCGCGCGTAATACGACTCACTATAGGGCGAATTGGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCTAGACTGCAATACAAACACCTGTTTCACAATTTGGCAGATCAGCCCAAAAAAGTACATTCTCTTCTTTTACAATACCTAGTTTTATCATTACTTGAACTAAAGGACTTCTCAAAGCAGTTTCACGATCAGTTATAGTTTCTGTCGATGTAAAAACTATAAATTTAATTTTTTCAGCTGGTATCGTGAAATATAAAGAGCTCGCTATACCAGCAACTGCATCAGGAAGCATATCTGTCATCATCAAAACTTCAAATGATATTTTTGATGGAATATCAACCATTGAAGGATAGTTTTGCATTATTAATGTATTAATGATACCGCCACCAGGGTGACCTTTGAAGAACAAATCATAACTATTGCCTAAATAATGTGGGCTCGATTCATTAATTGCATTATTAATGACATTAATTTGTTGTTTCGCATAATACTCTCTTTCATGGTTACCAGCCCATACAGTCGTACCTGTAAAC
ACAAAGTTTGGTAAATTAGATGAATTATATTCATTTTGTAATTTTTGTTTGTCAAAATTAACAATCGATAAGAATAATTCTTGTTGTTTGCTATTGAATTTTTTGAAACCATCCCATTGCATTTGCTTTAAACTATCACCAATATAGTCTCGTAACTCATGTAATGATGGTTCTAAAGTTAAATAATCTTTTCTTAAAAAATGGTAGTTAGCTGGATATAGTTTTTGCCAGTTATAAACAGATGATGTTCCTGTATTTGAAGTGTCTTCATTGATACCATTAATGACATCCTCAAGATAATCTTTACCAATTTTTAAATTATCTGTTTTATTTAATGTATCTCTCCAGTTATATAAATTTACATATTCTGCTGAACCATCATCATATAAATCTATATTTGTTACCGTAACGTTATTAAACGAATTTAATTCTTTTAGTATTGGCACTAAATTATCAAATGAATGAGCAGTGTTAGAGCTAAGTTTAACATTCAATCTATGCTTTGTTTGTGCTTGCTTAACAATTTCTTGTACTAAGTCAGCTGGTGTATGGTTATTTATCAATGCAAACGATGTAATATTTAACTCTTTCATTTGCTCATCAGTCGGAACTATTCTCCCCCAAGCTATATATCTTTGTGCTGTAGGATTTTCTTCTTCCGATTTAATAATATCCATTAGCTGCTGAAGAGTTGGAAGAGATGCATGATCAACATAAACCTCTAAAGATGGAGCCACTACGTTTAATGTTACTTTTGTTATATATTTTTCACCTTTATTACTAACACCATTAAAATCAAAGCAGTACTTTTCATCGTCATCTAATCGTGGCGCCACTACAGATAATGATATTGACTCTTTATTTTGTTCTGTTAATAGTTGTTGCGTACCACAAGTTTGTACCCAAGAGTGTTTTGTAAAAGAGATGTTTGATTGATTAATTGGCTCTAAATTAACATACTCCTCATCAATAATAGTTTTATTAATATCATTTTTAATAATAGATTGTGTATTTTCTTCTGACATctagtttgtcccctctttcCTCGAGGGGGGGCCCGGTACCCAGCTTTTGTTCCCTTTAGTGAGGGTTAATTGCGCGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCT
TCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAC SEQ ID NO: 10CTTTattaaacctactATG SEQ ID NO: 11 CTTTcttcaacctactATGSEQ ID NO: 12 
>pEC3′-(T7)GlmS-(T7)NagC-purA_(pG356)TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCactagtGTTGAGGAAAACGATTGGCTGAACAAAAAACAGACTGATCGAGGTCATTTTTGAGTGCAAAAAGTGCTGTAACTCTGAAAAAGCGATGGTAGAATCCATTTTTAAGCAAACGGTGATTTTGAAAAATGGGTAACAACGTCGTCGTACTGGGCACCCAATGGGGTGACGAAGGTAAAGGTAAGATCGTCGATCTTCTGACTGAACGGGCTAAATATGTTGTACGCTACCAGGGCGGTCACAACGCAGGCCATACTCTCGTAATCAACGGTGAAAAAACCGTTCTCCATCTTATTCCATCAGGTATTCTCCGCGAGAATGTAACCAGCATCATCGGTAACGGTGTTGTGCTGTCTCCGGCCGCGCTGATGAAAGAGATGAAAGAACTGGAAGACCGTGGCATCCCCGTTCGTGAGCGTCTGCTGCTGTCTGAAGCATGTCCGCTGATCCTTGATTATCACGTTGCGCTGGATAACGCGCGTGAGAAAGCGCGTGGCGCGAAAGCGATCGGCACCACCGGTCGTGGTATCGGGCCTGCTTATGAAGATAAAGTAGCACGTCGCGGTCTGCGTGTTGGCGACCTTTTCGACAAAGAAACCTTCGCTGAAAAACTGAAAGAAGTGATGGAATATCACAACTTCCAGTTGGTTAACTACTACAAAGCTGAAGCGGTTGATTACCAGAAAGTTCTGGATGATACGATGGCTGTTGCCGACATCCTGACTTCTATGGTGGTTGACGTTTCTGACCTGCTCGACCAGGCGCGTCAGCGTGGCGATTTCGTCATGTTTGAAGGTGCGCAGGGTACGCTGCTGGATATCGACCACGGTACTTATCCGTACGTAACTTCTTCCAACACCACTGCTGGTGGCGTGGCGACCGGTTCCGGCCTGGGCCCGCGTTATGTTGATTACGTTCTGGGTATCCTCAAAGCTTACTCCACTCGTGTAGGTGCAGGTCCGTTCCCGACCGAACTGTTTGATGAAACTGGCGAGTTCCTCTGCAAGCAGGGTAACGAATTCGGCGCAACTACGGGGCGTCGTCGTCGTACCGGCTGGCTGGACACCGTTGCCGTTCGTCGTGCGGTACAGCTGAACTCCCTGTCTGGCTTCTGCCTGACTAAACTGGACGTTCTGGATGGCCTGAAAGAGGTTAAACTCTGCGTGGCTTACCGTATGCCGGATGGTCGCGAAGTGACTACCACTCCGCTGGCAGCTGACGACTGGAAAGGTGTAGAGCCGATTTACGAAACCATGCCGGGCTGGTCTGAATCCACCTTCGGCGTGAAAGATCGTAGCGGCCTGCCGCAGGCGGCGCTGAACTATATCAAGCGTATTGAAGAGCTGACTGGTGTGCCGATCGATATCATCTCTACCGGTCCGGATCGTACTGAAACCATGATTCTGCGCGACCCGTTCGACGCGTAATTCTGGTACGCCTGGCAGATATTTTGCCTGCCGGGCGAACAGTGTGATACATTGCTGTGTCGGGTAAGCCATTACGCTATCCGACACAGTGTTAAATCCTCGCTTTTTTCCTTCCCCagatctGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGC
TTACTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGGTCGACTTAATTTTCCAGCAAATGCTGGAGCAAAATACCGTTGAGCATGGCGCGTTTTACCAGCGCAAAAGCGCCGATTGCCGAGCGGTGATCCAGCTCAGAACGTACCACCGGCAGATTAGTGCGAAACGCCTTCAGCGCCTGGGTATTAATGCAGCTTTCAATAGCAGGGAGCAGCACTTTATCGGCTTCGGTGATTTCACCGGCAATAACAATTTTTTGCGGATTAAATAAGTTGATAGCAATGGCGATGGTTTTACCCAGATGACGACCGACATACTCAATTACTTCCGACGCCAGACTATCGCCTTTGTTCGCGGCTTTGCAGATAGTTTTGATGGTGCAGTCGTCCAGCGGCACGCGGCTCTGGTAGCCCTGCTTTAACAGATTCAACACCCGTTGTTCAATGGCAGCGTTGGCAGCGATAGTTTCCAGGCAGCCAAAGTTGCCGCAGTGGCAGCGTTCACCCAGCGGTTCGACCTGAATATGGCCAATTTCACCGACGTTGCCGTTGCGGCCAATAAAAATGCGCCCGTTAGAGATAATCCCGGCCCCGGTTCCGCGATGGACACGCACCAGAATGGAGTCTTCGCAATCCTGACTTGCACCGAAGTAGTGCTCCGCCAGCGCCAGACTACGGATATCGTGACCAACGAAACAGGTCACTTTAAAACGTTCTTCCAGAGCTTCTACCAGCCCCCAGTTTTCTACCTGAATATGCGGCATGTAATGAATTTTGCCGCTGTCCGGGTCAACAAGCCCTGGCAGGATCACCGAAATCGCGATCAGCTCGCGCAGTTTGCGCTGGTAGCTATCAATAAACTGAGCAATGGCATTCAACAGGGCATGTTCCAGCGTTTGCTGGGTACGTTCCGGCAGCGGGTAATGTTCTTCTGCCAGCACTTTGCTGCTGAGATCAAACAGAGTGATGGTGGCGTCATGACGACCAAGCCGTACGCCGATTGCGTGGAAATTGCGGGTTTCGGTGACGATGGAGATAGCGCGGCGGCCCCCGGTGGAGGCCTGCTGATCAACTTCTTTGATCAGCCCGCGTTCGATAAGCTGACGCGTAATTTTGGTTACGCTGGCGGGGGCAAGCTGGCTTTGCTCGGCAATCTGAATCCGCGAGATTGGCCCGTACTGGTCAATCAGGCGATAAACCGCCGCGCTGTTAAGCTGTTTTACGAGATCAACATTACCTATCTGAGCTTGTCCGCCTGGTGTCATATGTATATCTCCTTCTTgtcgacTCTAGATGCATGCTCGAGATTACTCAACCGTAACCGATTTTGCCAGGTTACGCGGCTGGTCAACGTCGGTGCCTTTGATCAGCGCGACATGGTAAGCCAGCAGCTGCAGCGGAACGGTGTAGAAGATCGGTGCAATCACCTCTTCCACATGCGGCATCTCGATGATGTGCATGTTATCGCTACTTACAAAACCCGCATCCTGATCGGCGAAGACATACAACTGACCGCCACGCGCGCGAACTTCTTCAATGTTGGATTTCAGTTTTTCCAGCAATTCGTTGTTCGGTGCAACAACAATAACCGGCATATCGGCATCAATTAGCGCCAGCGGACCGTGTTTCAGTTCGCCAGCAGCGTAGGCTTCAGCGTGAATGTAAGAGATCTCTTTCAACTTCAATGCGCCTTCCAGCGCGATTGGGTACTGATCGCCACGGCCCAGGAACAGCGCGTGATGTTTGTCAGAGAAATCTTCTGCCAGCGCTTCAATGCGTTTGTCCTGAGACAGCATCTGCTCAATACGGCTCGGCAGCGCCTGCAGACCATGCACGATGTCATGTTCAATGGAGGCATCCAGACCTTTCAGGCGAGACAGCTTCGCCACCAGCATCAACAGCACAGTTAACTGAGTGGTGAATGCTTTAGTGGATGCCACGCCGATTTCTGTACCCGCGTTGG
TCATTAGCGCCAGATCGGATTCGCGCACCAGAGAAGAACCCGGAACGTTACAGATTGCCAGTGAACCAAGGTAACCCAGCTCTTTCGACAGACGCAGGCCAGCCAGGGTATCCGCGGTTTCGCCAGACTGTGACAAGGTGATCATCAGGCTGTTACGACGCACGGCAGATTTGCGATAGCGGAATTCAGAGGCGATTTCGACGTCGCACGGAATACCTGCTAGCGATTCAAACCAGTAGCGGGAAACCATACCGGAGTTATAAGAAGTACCACAGGCGAGGATCTGAATATGCTCAACCTTCGACAGCAGTTCGTCGGCGTTCGGTCCCAGCTCGCTTAAATCAACCTGACCGTGGCTGATGCGTCCGGTAAGGGTGTTTTTGATCGCGTTCGGCTGTTCGTAGATCTCTTTCTGCATGTAGTGACGGTAAATGCCTTTATCGCCCGCGTCATATTGCAGATTGGATTCGATATCCTGACGTTTTACTTCCGCGCCAGTTTTATCGAAGATGTTTACCGAACGGCGAGTGATTTCCGCAATATCGCCCTCTTCAAGGAAGATAAAGCGACGGGTCACCGGCAACAGCGCCAGCTGGTCAGAAGCGATAAAGTTTTCGCCCATCCCCAGGCCAATCACCAGCGGACTACCAGAACGTGCCGCCAGCAGGGTATCCGGGTGACGGGAGTCCATGATCACTGTACCGTACGCACCACGCAGCTGCGGGATAGCACGCAGAACGGCCTCACGCAGAGTCCCGCCTTGTTTCAGCTCCCAGTTCACCAGATGGGCAATCACTTCGGTGTCGGTTTCAGAAACGAAGGTATAGCCACGCGCTTTTAGCTCTTCACGCAGCGGTTCATGGTTTTCGATGATGCCGTTATGCACCACCACAATGTGTTCAGAAACATGCGGATGCGCATTCACTTCTGAAGGTTCACCGTGGGTCGCCCAGCGAGTGTGAGCAATACCAGTGCCGCCATGCAGAGGATGTTCTTCCGCTGCCTGTGCCAGCATCTGGACTTTACCGAGGCGACGCAGGCGGGTCATATGACCTTCTGCATCAACAACGGCCAGACCGGCAGAGTCATATCCGCGGTATTCCAGACGACGTAAACCTTCAAGAAGGATTTCTGCTACATCACGTTGCGCGATCGCGCCAACAATTCCACACATATGtatatctccttcttgaaTTCTAACAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTGCTAGCGGAGTGTATACTGGCTTACTATGTTGGCACTGATGAGGGTGTCAGTGAAGTGCTTCATGTGGCAGGAGAAAAAAGGCTGCACCGGT
GCGTCAGCAGAATATGTGATACAGGATATATTCCGCTTCCTCGCTCACTGACTCGCTACGCTCGGTCGTTCGACTGCGGCGAGCGGAAATGGCTTACGAACGGGGCGGAGATTTCCTGGAAGATGCCAGGAAGATACTTAACAGGGAAGTGAGAGGGCCGCGGCAAAGCCGTTTTTCCATAGGCTCCGCCCCCCTGACAAGCATCACGAAATCTGACGCTCAAATCAGTGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGCGGCTCCCTCGTGCGCTCTCCTGTTCCTGCCTTTCGGTTTACCGGTGTCATTCCGCTGTTATGGCCGCGTTTGTCTCATTCCACGCCTGACACTCAGTTCCGGGTAGGCAGTTCGCTCCAAGCTGGACTGTATGCACGAACCCCCCGTTCAGTCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGAAAGACATGCAAAAGCACCACTGGCAGCAGCCACTGGTAATTGATTTAGAGGAGTTAGTCTTGAAGTCATGCGCCGGTTAAGGCTAAACTGAAAGGACAAGTTTTGGTGACTGCGCTCCTCCAAGCCAGTTACCTCGGTTCAAAGAGTTGGTAGCTCAGAGAACCTTCGAAAAACCGCCCTGCAAGGCGGTTTTTTCGTTTTCAGAGCAAGAGATTACGCGCAGACCAAAACGATCTCAAGAAGATCATCTTATTAATCAGATAAAATATTTCTAGGCggccgcGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCSEQ ID NO: 13 
>neuC_N-acetylglucosamine-6-phosphate-2-epimerase_GI_15193223_in_pG317MKKILFITGSRADYSKIKSLMYRVQNSSEFELYIFATGMHLSKNFGYTVKELYKNGFKNIYEFINYDKYYQTDKALATTIDGFSRYANELKPDLIVVHGDRIEPLAAAIVGALNNILVAHIEGGEISGTIDDSLRHAISKLAHIHLVNDEFAKRRLMQLGEDEKSIFIIGSPDLELLNDNKISLSEAKKYYDINYENYALLMFHPVTTEITSIKNQADNLVKALIQSNKNYIVIYPNNDLGFELILQSYEEFKNNPRFKLFPSLRFEYFITLLKNADFIIGNSSCILKEALYLKTAGILVGSRQNGRLGNENTLKVNANSDEILKAINTIHKKQDLFSAKLEILDSSKLFFEYLQSGDFFKLSTQKVFKDIKSEQ ID NO: 14 >neuB_sialic_acid_synthase_GI_15193222_in_pG317MKEIKIQNIIISEEKAPLVVPEIGINHNGSLELAKIMVDAAFSAGAKIIKHQTHIVEDEMSKAAKKVIPGNAKISIYEIMQKCALDYKDELALKEYTEKLGLVYLSTPFSRAGANRLEDMGVSAFKIGSGECNNYPLIKHIAAFKKPMIVSTGMNSIESIKPTVKILLDNEIPFVLMHTTNLYPTPHNLVRLNAMLELKKEFSCMVGLSDHTTDNLACLGAVVLGACVLERHFTDSMHRSGPDIVCSMDTKALKELIIQSEQMAIIRGNNESKKAAKQEQVTIDFAFASVVSIKDIKKGEVLSMDNIWVKRPGLGGISAAEFENILGKKALRDIENDAQLSYEDFASEQ ID NO: 15 >neuA_CMP-Neu5Ac_synthase_GI_15193224_in_pG317MSLAIIPARGGSKGIKNKNLVLLNNKPLIYYTIKAALNAKSISKVVVSSDSDEILNYAKSQNVDILKRPISLAQDDTTSDKVLLHALKFYKDYEDVVFLQPTSPLRTNIHINEAFNLYKNSNANALISVSECDNKILKAFVCNDCGDLAGICNDEYPFMPRQKLPKTYMSNGAIYILKIKEFLNNPSFLQSKTKHFLMDESSSLDIDCLEDLKKVEQIWKKSEQ ID NO: 16 >AAF42258 lacto-N-neotetraose biosynthesis glycosyltransferase LgtA [Neisseria meningitidis MC58].MPSEAFRRHRAYRENKLQPLVSVLICAYNVEKYFAQSLAAVVNQTWRNLDILIVDDGSTDGTLAIAQRFQEQDGRIRILAQPRNSGLIPSLNIGLDELAKSGGGGEYIARTDADDIAAPDWIEKIVGEMEKDRSIIAMGAWLEVLSEEKDGNRLARHHEHGKIWKKPTRHEDIADFFPFGNPIHNNTMIMRRSVIDGGLRYNTERDWAEDYQFWYDVSKLGRLAYYPEALVKYRLHANQVSSKYSIRQHEIAQGIQKTARNDFLQSMGFKTRFDSLEYRQIKAVAYELLEKHLPEEDFERARRFLYQCFKRTDTLPAGAWLDFAADGRMRRLFTLRQYFGILHRLLKNRSEQ ID NO: 17 >NP_207619 lipooligosaccharide 5G8 epitopebiosynthesis-associated protein Lex2B [Helicobacter pylori 26695].MRVFAISLNQKVCDTFGLVFRDTTTLLNSINATHHQAQIFDAIYSKTFEGGLHPLVKKHLHPYFITQNIKDMGITTNLISEVSKFYYALKYHAKFMSLGELGCYASHYSLWEKCIELNEAICILEDDITLKEDFKEGLDFLEKHIQELGYIRLMHLLYDASVKSEPLSHKNHEIQERVGIIKAYSEGVGTQGYVITPKIAKVFLKCSRKWVVPVDTIMDATFIHGVKNLVLQPFVIADDEQISTIARKEEPYSPKIALMRELHFKYLKYWQFV SEQ ID NO: 
18[E.coli_WbgO_YP_003500090 putative glycosyltransferase WbgO [Escherichia coli O55:H7 str. CB9615].MIIDEAESAESTHPVVSVILPVNKKNPFLDEAINSILSQTFSSFEIIIVANCCTDDFYNELKHKVNDKIKLIRTNIAYLPYSLNKAIDLSNGEFIARMDSDDISHPDRFTKQVDFLKNNPYVDVVGTNAIFIDDKGREINKTKLPEENLDIVKNLPYKCCIVHPSVMFRKKVIASIGGYMFSNYSEDYELWNRLSLAKIKFQNLPEYLFYYRLHEGQSTAKKNLYMVMVNDLVIKMKCFFLTGNINYLFGGIRTIASFIYCKYIKSEQ ID NO: 19 >BAA35319 DNA-binding transcriptional dual regulatornagC [Escherichia coli str. K-12 substr. W3110].MTPGGQAQIGNVDLVKQLNSAAVYRLIDQYGPISRIQIAEQSQLAPASVTKITRQLIERGLIKEVDQQASTGGRRAISIVTETRNFHAIGVRLGRHDATITLFDLSSKVLAEEHYPLPERTQQTLEHALLNAIAQFIDSYQRKLRELIAISVILPGLVDPDSGKIHYMPHIQVENWGLVEALEERFKVTCFVGHDIRSLALAEHYFGASQDCEDSILVRVHRGTGAGIISNGRIFIGRNGNVGEIGHIQVEPLGERCHCGNFGCLETIAANAAIEQRVLNLLKQGYQSRVPLDDCTIKTICKAANKGDSLASEVIEYVGRHLGKTIAIAINLFNPQKIVIAGEITEADKVLLPAIESCINTQALKAFRTNLPVVRSELDHRSAIGAFALVKRAMLNGILLQHLLENSEQ ID NO: 20 >NP_418185 L-glutamine:D-fructose-6-phosphateaminotransferase glmS [Escherichia coli str. K-12 substr. MG1655].MCGIVGAIAQRDVAEILLEGLRRLEYRGYDSAGLAVVDAEGHMTRLRRLGKVQMLAQAAEEHPLHGGTGIAHTRWATHGEPSEVNAHPHVSEHIVVVHNGIIENHEPLREELKARGYTFVSETDTEVIAHLVNWELKQGGTLREAVLRAIPQLRGAYGTVIMDSRHPDTLLAARSGSPLVIGLGMGENFIASDQLALLPVTRRFIFLEEGDIAEITRRSVNIFDKTGAEVKRQDIESNLQYDAGDKGIYRHYMQKEIYEQPNAIKNTLTGRISHGQVDLSELGPNADELLSKVEHIQILACGTSYNSGMVSRYWFESLAGIPCDVEIASEFRYRKSAVRRNSLMITLSQSGETADTLAGLRLSKELGYLGSLAICNVPGSSLVRESDLALMTNAGTEIGVASTKAFTTQLTVLLMLVAKLSRLKGLDASIEHDIVHGLQALPSRIEQMLSQDKRIEALAEDFSDKHHALFLGRGDQYPIALEGALKLKEISYIHAEAYAAGELKHGPLALIDADMPVIVVAPNNELLEKLKSNIEEVRARGGQLYVFADQDAGFVSSDNMHIIEMPHVEEVIAPIFYTVPLQLLAYHVALIKGTDVDQPR NLAKSVTVESEQ ID NO: 21 >BAF92026 beta-galactoside alpha-2,6-sialyltransferase[Photobacterium sp. 
JT-ISH-224].MKNFLLLTLILLTACNNSEENTQSIIKNDINKTIIDEEYVNLEPINQSNISFTKHSWVQTCGTQQLLTEQNKESISLSVVAPRLDDDEKYCFDFNGVSNKGEKYITKVTLNVVAPSLEVYVDHASLPTLQQLMDIIKSEEENPTAQRYIAWGRIVPTDEQMKELNITSFALINNHTPADLVQEIVKQAQTKHRLNVKLSSNTAHSFDNLVPILKELNSFNNVTVTNIDLYDDGSAEYVNLYNWRDTLNKTDNLKIGKDYLEDVINGINEDTSNTGTSSVYNWQKLYPANYHFLRKDYLTLEPSLHELRDYIGDSLKQMQWDGFKKFNSKQQELFLSIVNFDKQKLQNEYNSSNLPNFVFTGTTVWAGNHEREYYAKQQINVINNAINESSPHYLGNSYDLFFKGHPGGGIINTLIMQNYPSMVDIPSKISFEVLMMTDMLPDAVAGIASSLYFTIPAEKIKFIVFTSTETITDRETALRSPLVQVMIKLGIVKEENVLFWADLPNCETGVCIAV

Provided below is the DNA sequence in GenBank format of the new configuration of genes engineered at the Escherichia coli thyA locus in strains used to produce N-acetylglucosamine-containing oligosaccharides.

LOCUS       E680_thyA::2.8RBS_lacZ 5877 bp DNA linear BCT 04 MAR. 2013
DEFINITION  Escherichia coli str. K-12 substr. MG1655, complete genome.
ACCESSION   NC_000913
VERSION     NC_000913.2 GI:49175990
KEYWORDS    .
SOURCE      Escherichia coli str. K-12 substr. MG1655 (unknown)
  ORGANISM  Escherichia coli str. K-12 substr. MG1655
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
            Enterobacteriaceae; Escherichia.
REFERENCE   1 (bases 1 to 4639675)
  AUTHORS   Riley,M., Abe,T., Arnaud,M.B., Berlyn,M.K., Blattner,F.R.,
            Chaudhuri,R.R., Glasner,J.D., Horiuchi,T., Keseler,I.M., Kosuge,T.,
            Mori,H., Perna,N.T., Plunkett,G. III, Rudd,K.E., Serres,M.H.,
            Thomas,G.H., Thomson,N.R., Wishart,D. and Wanner,B.L.
  TITLE     Escherichia coli K-12: a cooperatively developed annotation
            snapshot--2005
  JOURNAL   Nucleic Acids Res. 34 (1), 1-9 (2006)
  PUBMED    16397293
  REMARK    Publication Status: Online-Only
REFERENCE   2 (bases 1 to 4639675)
  AUTHORS   Blattner,F.R., Plunkett,G. III, Bloch,C.A., Perna,N.T., Burland,V.,
            Riley,M., Collado-Vides,J., Glasner,J.D., Rode,C.K., Mayhew,G.F.,
            Gregor,J., Davis,N.W., Kirkpatrick,H.A., Goeden,M.A., Rose,D.J.,
            Mau,B. and Shao,Y.
  TITLE     The complete genome sequence of Escherichia coli K-12
  JOURNAL   Science 277 (5331), 1453-1474 (1997)
  PUBMED    9278503
REFERENCE   3 (bases 1 to 4639675)
  AUTHORS   Arnaud,M., Berlyn,M.K.B., Blattner,F.R., Galperin,M.Y.,
            Glasner,J.D., Horiuchi,T., Kosuge,T., Mori,H., Perna,N.T.,
            Plunkett,G. III, Riley,M., Rudd,K.E., Serres,M.H., Thomas,G.H. and
            Wanner,B.L.
  TITLE     Workshop on Annotation of Escherichia coli K-12
  JOURNAL   Unpublished
  REMARK    Woods Hole, Mass., on 14-18 Nov. 2003 (sequence corrections)
REFERENCE   4 (bases 1 to 4639675)
  AUTHORS   Glasner,J.D., Perna,N.T., Plunkett,G. III, Anderson,B.D.,
            Bockhorst,J., Hu,J.C., Riley,M., Rudd,K.E. and Serres,M.H.
  TITLE     ASAP: Escherichia coli K-12 strain MG1655 version m56
  JOURNAL   Unpublished
  REMARK    ASAP download 10 June 2004 (annotation updates)
REFERENCE   5 (bases 1 to 4639675)
  AUTHORS   Hayashi,K., Morooka,N., Mori,H. and Horiuchi,T.
  TITLE     A more accurate sequence comparison between genomes of Escherichia
            coli K12 W3110 and MG1655 strains
  JOURNAL   Unpublished
  REMARK    GenBank accessions AG613214 to AG613378 (sequence corrections)
REFERENCE   6 (bases 1 to 4639675)
  AUTHORS   Perna,N.T.
  TITLE     Escherichia coli K-12 MG1655 yqiK-rfaE intergenic region, genomic
            sequence correction
  JOURNAL   Unpublished
  REMARK    GenBank accession AY605712 (sequence corrections)
REFERENCE   7 (bases 1 to 4639675)
  AUTHORS   Rudd,K.E.
  TITLE     A manual approach to accurate translation start site annotation: an
            E. coli K-12 case study
  JOURNAL   Unpublished
REFERENCE   8 (bases 1 to 4639675)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (04-MAR-2013) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   9 (bases 1 to 4639675)
  AUTHORS   Rudd,K.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-2013) Department of Biochemistry and Molecular
            Biology, University of Miami Miller School of Medicine, 118 Gautier
            Bldg., Miami, FL 33136, USA
  REMARK    Sequence update by submitter
REFERENCE   10 (bases 1 to 4639675)
  AUTHORS   Rudd,K.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (24-APR-2007) Department of Biochemistry and Molecular
            Biology, University of Miami Miller School of Medicine, 118
            Gautier Bldg., Miami, FL 33136, USA
  REMARK    Annotation update from ecogene.org as a multi-database collaboration
REFERENCE   11 (bases 1 to 4639675)
  AUTHORS   Plunkett,G. III.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-FEB-2006) Laboratory of Genetics, University of
            Wisconsin, 425G Henry Mall, Madison, WI 53706-1580, USA
  REMARK    Protein updates by submitter
REFERENCE   12 (bases 1 to 4639675)
  AUTHORS   Plunkett,G. III.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-JUN-2004) Laboratory of Genetics, University of
            Wisconsin, 425G Henry Mall, Madison, WI 53706-1580, USA
  REMARK    Sequence update by submitter
REFERENCE   13 (bases 1 to 4639675)
  AUTHORS   Plunkett,G. III.
  TITLE     Direct Submission
  JOURNAL   Submitted (13-OCT-1998) Laboratory of Genetics, University of
            Wisconsin, 425G Henry Mall, Madison, WI 53706-1580, USA
REFERENCE   14 (bases 1 to 4639675)
  AUTHORS   Blattner,F.R. and Plunkett,G. III.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-SEP-1997) Laboratory of Genetics, University of
            Wisconsin, 425G Henry Mall, Madison, WI 53706-1580, USA
REFERENCE   15 (bases 1 to 4639675)
  AUTHORS   Blattner,F.R. and Plunkett,G. III.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-JAN-1997) Laboratory of Genetics, University of
            Wisconsin, 425G Henry Mall, Madison, WI 53706-1580, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence is identical to U00096.
            On Jun 24, 2004 this sequence version replaced gi:16127994.
            Current U00096 annotation updates are derived from EcoGene
            ecogene.org. Suggestions for updates can be sent to Dr.
            Kenneth Rudd (krudd@miami.edu). These updates are being generated
            from a collaboration that also includes ASAP/ERIC, the Coli Genetic
            Stock Center, EcoliHub, EcoCyc, RegulonDB and UniProtKB/Swiss-Prot.
            COMPLETENESS: full length.
FEATURES Location/Qualifiers genecomplement(<1..245) /gene=″ppdA″ /locus_tag=″b2826″/gene_synonym=″ECK2822; JW2794″ /db_xref=″EcoGene:EG12081″/db_xref=″GeneID:945393″ CDS complement(<1..245) /gene=″ppdA″/locus_tag=″b2826″ /gene_synonym=″ECK2822; JW2794″/function=″putative enzyme; Not classified″/GO_component=″GO:0009289 - pilus″/GO_process=″GO:0009101 - glycoprotein biosynthetic process″/note=″prepilin peptidase dependent protein A″ /codon_start=1/transl_table=11 /product=″hypothetical protein″/protein_id=″NP 417303.1″ /db_xref=″GI:16130730″/db_xref=″ASAP:ABE-0009266″ /db_xref=″UniProtKB/Swiss-Prot:P33554″/db_xref=″EcoGene:EG12081″ /db_xref=″GeneID:945393″ (SEQ ID NO: 22)/translation=″MKTQRGYTLIETLVAMLILVMLSASGLYGWQYWQQSQRLWQTASQARDYLLYLREDANWHNRDHSISVIREGTLWCLVSSAAGANTCHGSSPLVFVPRWPEVEMSDLTPSLAFFGLRNTAWAGHIRFKNSTGEWWLVVSPWGRLRLCQQGETEGCL″ sourcejoin(<1..449,4852..+225877)/organism=″Escherichia coli str.K-12 substr. MG1655″/mol_type=″genomic DNA″ /strain=″K-12″ /sub_strain=″MG1655″/db_xref=″taxon:511145″ primer 346..366/note=cagtcagtcaggcgccTTCGGGAAGGCGTCTCGAAGA (SEQ ID NO: 23)/label=0268-THYA-R misc_feature complement(388..394)/feature_type=″Hairpin loop″ /label=Terminator primer 400..449/note=GGCGTCGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTagGaaacctactATGACCATGATTACGGATTCAC (SEQ ID NO: 24)/label=″50bp thyA 3 prime homology″ primer 400..483/note=GGCGTCGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTattaaacctactATGACCATGATTACGGATTCAC (SEQ ID NO: 25)/label=1389-thyAKANlacZ-R-2-8 primer 400..483/note=GGCGTCGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTCttCaacctactATGACCATGATTACGGATTCAC (SEQ ID NO: 26)/label=1516-thyAKANlacZ-R-0-8 primer 400..483/note=GGCGTCGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTagGaaacctactATGACCATGATTACGGATTCAC (SEQ ID NO: 27)/label=″1041-thyAKANlacZ-R (4-8)″ misc_feature complement(401..407)/feature_type=″Hairpin loop″ /label=Terminator primer 405..472/note=CGGCTCTGGCAGGATGTTTCGTAATTAGATAGCCACCGGCGCTTTaTTaaac ctactATGACCATGAT (SEQ ID NO: 28) 
/label=1394-2/8-F genecomplement(join(429..449,4852..4854)) /gene=″thyA″ CDScomplement(join(429..449,4852..4854)) /gene=″thyA″/note=″ECK2823:JW2795:b2827″ /codon_start=1 /transl_table=11/product=″thymidylate synthetase″ /protein_id=″BAE76896.1″/db_xref=″GI:85675643″ (SEQ ID NO: 43)/translation=″MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELLWFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLKNDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIASYALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIFDYRFEDFEIEGYDPHPGIKAPVAI″ RBS 450..461/label=″2.8 RBS″ source 450..3536 /organism=″Escherichia coli W3110″/mol_type=″genomic DNA″ /strain=″K-12″ /sub_strain=″W3110″/db_xref=″taxon:316407″/note=″synonym: Escherichia coli str. K12 substr. W3110″ misc_feature450..4851 /feature_type=Insertion/note=″originates from KanR-lacZRBS (E403)″ /label=Insert misc_feature449″450 /feature type=″RBS variation site″ /label=″C in 0/8″misc_feature 450..453 /feature_type=″RBS variation site″/label=″CTTC in 0/8″ misc_feature 451..452/feature_type=″RBS variation site″ /label=″GG in 4/8″ misc_feature451..452 /feature_type=″RBS variation site″ /label=″TT in 2/8″ CDS462..3536 /gene=″lacZ″ /note=″ECK0341:JW0335:b0344″ /codon_start=1/transl_table=11 /product=″beta-D-galactosidase″/protein_id=″BAE76126.1″ /db_xref=″GI:85674486″ (SEQ ID NO: 
29)/translation=″MTMITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRSLNGEWRFAWFPAPEAVPESWLECDLPEADTVVVPSNWQMHGYDAPIYTNVTYPITVNPPFVPTENPTGCYSLTFNVDESWLQEGQTRIIFDGVNSAFHLWCNGRWVGYGQDSRLPSEFDLSAFLRAGENRLAVMVLRWSDGSYLEDQDMWRMSGIFRDVSLLHKPTTQISDFHVATRFNDDFSRAVLEAEVQMCGELRDYLRVTVSLWQGETQVASGTAPFGGEIIDERGGYADRVTLRLNVENPKLWSAEIPNLYRAVVELHTADGTLIEAEACDVGFREVRIENGLLLLNGKPLLIRGVNRHEHHPLHGQVMDEQTMVQDILLMKQNNFNAVRCSHYPNHPLWYTLCDRYGLYVVDEANIETHGMVPMNRLTDDPRWLPAMSERVTRMVQRDRNHPSVIIWSLGNESGHGANHDALYRWIKSVDPSRPVQYEGGGADTTATDIICPMYARVDEDQPFPAVPKWSIKKWLSLPGETRPLILCEYAHAMGNSLGGFAKYWQAFRQYPRLQGGFVWDWVDQSLIKYDENGNPWSAYGGDFGDTPNDRQFCMNGLVFADRTPHPALTEAKHQQQFFQFRLSGQTIEVTSEYLFRHSDNELLHWMVALDGKPLASGEVPLDVAPQGKQLIELPELPQPESAGQLWLTVRVVQPNATAWSEAGHISAWQQWRLAENLSVTLPAASHAIPHLTTSEMDFCIELGNKRWQFNRQSGFLSQMWIGDKKQLLTPLRDQFTRAPLDNDIGVSEATRIDPNAWVERWKAAGHYQAEAALLQCTADTLADAVLITTAHAWQHQGKTLFISRKTYRIDGSGQMAITVDVEVASDTPHPARIGLNCQLAQVAERVNWLGLGPQENYPDRLTAACFDRWDLPLSDMYTPYVFPSENGLRCGTRELNYGPHQWRGDFQFNISRYSQQQLMETSHRHLLHAEEGTWLNIDGFHMGIGGDDSWSPSVSAEFQLSAGRYHYQLVWCQK″/label=″wild-type lacZ+CDS″ primer complement(1325..1345)/note=TTCAGACGTAGTGTGACGCGA /label=1042-thyAlacZcheck primer 2754..2776/note=TTTCTTTCACAGATGTGGATTGG /label=″1395-mid lacZ-F″ primercomplement(2779..2801) /note=CGGCGTCAGCAGTTGTTTTTTAT/label=″1396-mid lacZ-R″ mutation 2793/label=″C in MG1655 lacZ (silent change)″ scar complement(3549..3567)/label=″KD13 downstream scar sequence″ source 3549..4851/organism=″Template plasmid pKD13″ /mol_type=″genomic DNA″/db_xref=″taxon:170493″ primer 3549..3568 /label=″0339 Plw-P2b″repeat unit 3568..3579 /label=″FLP site″ misc_featurecomplement(3568..3601) /feature type=″FRT site″ /label=″34bp FRT site″note complement(3568..4789)/label=″excised region upon pCP20 introduction″ repeat unitcomplement(3590..3601) /label=″Flp site″ misc_featurecomplement(3602..3615) /feature type=″FRT site″ /note=″natural FRT site″/label=″upstream FRT site″ repeat_unit complement(3604..3615)/label=″Flp site″ misc_feature 
complement(3628..4422)/feature type=″CDS (KAN resistance)″ /note=″kanamycin resistance″/codon_start=1 /transl_table=11/product=″Tn5 neomycin phosphotransferase″ /protein_id=″AAL02037.1″/db_xref=″GI:15554336″ (SEQ ID NO: 30)/translation=″MIEQDGLHAGSPAAWVERLFGYDWAQQTIGCSDAAVFRLSAQGRPVLFVKTDLSGALNELQDEAARLSWLATTGVPCAAVLDVVTEAGRDWLLLGEVPGQDLLSSHLAPAEKVSIMADAMRRLHTLDPATCPFDHQAKHRIERARTRMEAGLVDQDDLDEEHQGLAPAELFARLKARMPDGEDLVVTHGDACLPNIMVENGRFSGFIDCGRLGVADRYQDIALATRDIAEELGGEWADRFLVLYGIAAPDSQRIAFYRLLDEFF″ primercomplement(3677..3696) /label=″0389 KD13 K4″ primer_bind 3791..3810/label=″common priming site kt″ primer 3791..3810/label=″0344 Wanner Kt primer″ mutation 3811/label=″A in wt (silent change)″ primer complement(4242..4261)/label=″0343 Wanner K2 primer″ primer_bind 4261..4280/label=″common priming site k2″ primer_bind 4352..4371/label=″common priming site k1″ primer 4352..4371/label=″0342 Wanner K1 primer″ repeat_unit 4790..4801 /label=″FLP site″scar complement(4790..4851) /label=″KD13 upstream scar″ misc_featurecomplement(4790..4823) /feature type=″FRT site″ /label=″34bp FRT site″repeat_unit complement(4812..4823) /label=″Flp site″ primercomplement(4832..4851) /label=″0338 P4w-P1b″ primercomplement(4832..4901)/note=TCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGATTCCGGGGATCCGTCGACC (SEQ ID NO: 31) /label=1040-thyAKANlacZ-F Sitecomplement(4858..4863) /site_type=″binding site″ /label=″thyA RBS″ genecomplement(4861..5736) /gene=″lgt″ CDS complement(4861..5736)/gene=″lgt″ /note=″ECK2824:JW2796:b2828″ /codon_start=1 /transl_table=11/product=″phosphatidylglycerol-prolipoproteindiacylglyceryl transferase″ /protein_id=″BAE76897.1″/db_xref=″GI:85675644″ (SEQ ID NO: 32)/translation=″MTSSYLHFPEFDPVIFSIGPVALHWYGLMYLVGFIFAMWLATRRANRPGSGWTKNEVENLLYAGFLGVFLGGRIGYVLFYNFPQFMADPLYLFRVWDGGMSFHGGLIGVIVVMIIFARRTKRSFFQVSDFIAPLIPFGLGAGRLGNFINGELWGRVDPNFPFAMLFPGSRTEDILLLQTNPQWQSIFDTYGVLPRHPSQLYELLLEGVVLFIILNLYIRKPRPMGAVSGLFLIGYGAFRIIVEFFRQPDAQFTGAWVQYISMGQILSIPMIVAGVIMMVWAYRRSPQQHVS″ 
promoter complement(4957..4962) /label=″thyA WEAK -10″promoter complement(4978..4983) /label=″thyA -35″ primercomplement(5076..5099) /note=cagtcagtcaggcgccTCCTCAACCTGTATATTCGTAAAC (SEQ ID NO: 33) /label=0267-THYA-F Site complement(5739..5744)/site type=″binding site″ /label=″Igt RBS″ promotercomplement(5823..5828) /label=″Igt -10 (strong)″ ORIGIN (SEQ ID NO: 34)   1 GCAGCGGAAC TCACAAGGCA CCATAACGTC CCCTCCCTGA TAACGCTGAT ACTGTGGTCG  61 CGGTTATGCC AGTTGGCATC TTCACGTAAA TAGAGCAAAT AGTCCCGCGC CTGGCTGGCG 121 GTTTGCCATA GCCGTTGCGA CTGCTGCCAG TATTGCCAGC CATAGAGTCC ACTTGCGCTT 181 AGCATGACCA AAATCAGCAT CGCGACCAGC GTTTCAATCA GCGTATAACC ACGTTGTGTT 241 TTCATGCCGG CAGTATGGAG CGAGGAGAAA AAAAGACGAG GGCCAGTTTC TATTTCTTCG 301 GCGCATCTTC CGGACTATTT ACGCCGTTGC AGGACGTTGC AAAATTTCGG GAAGGCGTCT 361 CGAAGAATTT AACGGAGGGT AAAAAAACCG ACGCACACTG GCGTCGGCTC TGGCAGGATG 421 TTTCGTAATT AGATAGCCAC CGGCGCTTTa ttaaacctac tATGACCATG ATTACGGATT 481 CACTGGCCGT CGTTTTACAA CGTCGTGACT GGGAAAACCC TGGCGTTACC CAACTTAATC 541 GCCTTGCAGC ACATCCCCCT TTCGCCAGCT GGCGTAATAG CGAAGAGGCC CGCACCGATC 601 GCCCTTCCCA ACAGTTGCGC AGCCTGAATG GCGAATGGCG CTTTGCCTGG TTTCCGGCAC 661 CAGAAGCGGT GCCGGAAAGC TGGCTGGAGT GCGATCTTCC TGAGGCCGAT ACTGTCGTCG 721 TCCCCTCAAA CTGGCAGATG CACGGTTACG ATGCGCCCAT CTACACCAAC GTGACCTATC 781 CCATTACGGT CAATCCGCCG TTTGTTCCCA CGGAGAATCC GACGGGTTGT TACTCGCTCA 841 CATTTAATGT TGATGAAAGC TGGCTACAGG AAGGCCAGAC GCGAATTATT TTTGATGGCG 901 TTAACTCGGC GTTTCATCTG TGGTGCAACG GGCGCTGGGT CGGTTACGGC CAGGACAGTC 961 GTTTGCCGTC TGAATTTGAC CTGAGCGCAT TTTTACGCGC CGGAGAAAAC CGCCTCGCGG1021 TGATGGTGCT GCGCTGGAGT GACGGCAGTT ATCTGGAAGA TCAGGATATG TGGCGGATGA1081 GCGGCATTTT CCGTGACGTC TCGTTGCTGC ATAAACCGAC TACACAAATC AGCGATTTCC1141 ATGTTGCCAC TCGCTTTAAT GATGATTTCA GCCGCGCTGT ACTGGAGGCT GAAGTTCAGA1201 TGTGCGGCGA GTTGCGTGAC TACCTACGGG TAACAGTTTC TTTATGGCAG GGTGAAACGC1261 AGGTCGCCAG CGGCACCGCG CCTTTCGGCG GTGAAATTAT CGATGAGCGT GGTGGTTATG1321 CCGATCGCGT CACACTACGT CTGAACGTCG AAAACCCGAA ACTGTGGAGC GCCGAAATCC1381 CGAATCTCTA 
TCGTGCGGTG GTTGAACTGC ACACCGCCGA CGGCACGCTG ATTGAAGCAG1441 AAGCCTGCGA TGTCGGTTTC CGCGAGGTGC GGATTGAAAA TGGTCTGCTG CTGCTGAACG1501 GCAAGCCGTT GCTGATTCGA GGCGTTAACC GTCACGAGCA TCATCCTCTG CATGGTCAGG1561 TCATGGATGA GCAGACGATG GTGCAGGATA TCCTGCTGAT GAAGCAGAAC AACTTTAACG1621 CCGTGCGCTG TTCGCATTAT CCGAACCATC CGCTGTGGTA CACGCTGTGC GACCGCTACG1681 GCCTGTATGT GGTGGATGAA GCCAATATTG AAACCCACGG CATGGTGCCA ATGAATCGTC1741 TGACCGATGA TCCGCGCTGG CTACCGGCGA TGAGCGAACG CGTAACGCGA ATGGTGCAGC1801 GCGATCGTAA TCACCCGAGT GTGATCATCT GGTCGCTGGG GAATGAATCA GGCCACGGCG1861 CTAATCACGA CGCGCTGTAT CGCTGGATCA AATCTGTCGA TCCTTCCCGC CCGGTGCAGT1921 ATGAAGGCGG CGGAGCCGAC ACCACGGCCA CCGATATTAT TTGCCCGATG TACGCGCGCG1981 TGGATGAAGA CCAGCCCTTC CCGGCTGTGC CGAAATGGTC CATCAAAAAA TGGCTTTCGC2041 TACCTGGAGA GACGCGCCCG CTGATCCTTT GCGAATACGC CCACGCGATG GGTAACAGTC2101 TTGGCGGTTT CGCTAAATAC TGGCAGGCGT TTCGTCAGTA TCCCCGTTTA CAGGGCGGCT2161 TCGTCTGGGA CTGGGTGGAT CAGTCGCTGA TTAAATATGA TGAAAACGGC AACCCGTGGT2221 CGGCTTACGG CGGTGATTTT GGCGATACGC CGAACGATCG CCAGTTCTGT ATGAACGGTC2281 TGGTCTTTGC CGACCGCACG CCGCATCCAG CGCTGACGGA AGCAAAACAC CAGCAGCAGT2341 TTTTCCAGTT CCGTTTATCC GGGCAAACCA TCGAAGTGAC CAGCGAATAC CTGTTCCGTC2401 ATAGCGATAA CGAGCTCCTG CACTGGATGG TGGCGCTGGA TGGTAAGCCG CTGGCAAGCG2461 GTGAAGTGCC TCTGGATGTC GCTCCACAAG GTAAACAGTT GATTGAACTG CCTGAACTAC2521 CGCAGCCGGA GAGCGCCGGG CAACTCTGGC TCACAGTACG CGTAGTGCAA CCGAACGCGA2581 CCGCATGGTC AGAAGCCGGG CACATCAGCG CCTGGCAGCA GTGGCGTCTG GCGGAAAACC2641 TCAGTGTGAC GCTCCCCGCC GCGTCCCACG CCATCCCGCA TCTGACCACC AGCGAAATGG2701 ATTTTTGCAT CGAGCTGGGT AATAAGCGTT GGCAATTTAA CCGCCAGTCA GGCTTTCTTT2761 CACAGATGTG GATTGGCGAT AAAAAACAAC TGtTGACGCC GCTGCGCGAT CAGTTCACCC2821 GTGCACCGCT GGATAACGAC ATTGGCGTAA GTGAAGCGAC CCGCATTGAC CCTAACGCCT2881 GGGTCGAACG CTGGAAGGCG GCGGGCCATT ACCAGGCCGA AGCAGCGTTG TTGCAGTGCA2941 CGGCAGATAC ACTTGCTGAT GCGGTGCTGA TTACGACCGC TCACGCGTGG CAGCATCAGG3001 GGAAAACCTT ATTTATCAGC CGGAAAACCT ACCGGATTGA TGGTAGTGGT CAAATGGCGA3061 TTACCGTTGA TGTTGAAGTG GCGAGCGATA CACCGCATCC 
GGCGCGGATT GGCCTGAACT3121 GCCAGCTGGC GCAGGTAGCA GAGCGGGTAA ACTGGCTCGG ATTAGGGCCG CAAGAAAACT3181 ATCCCGACCG CCTTACTGCC GCCTGTTTTG ACCGCTGGGA TCTGCCATTG TCAGACATGT3241 ATACCCCGTA CGTCTTCCCG AGCGAAAACG GTCTGCGCTG CGGGACGCGC GAATTGAATT3301 ATGGCCCACA CCAGTGGCGC GGCGACTTCC AGTTCAACAT CAGCCGCTAC AGTCAACAGC3361 AACTGATGGA AACCAGCCAT CGCCATCTGC TGCACGCGGA AGAAGGCACA TGGCTGAATA3421 TCGACGGTTT CCATATGGGG ATTGGTGGCG ACGACTCCTG GAGCCCGTCA GTATCGGCGG3481 AATTCCAGCT GAGCGCCGGT CGCTACCATT ACCAGTTGGT CTGGTGTCAA AAATAAGCGG3541 CCGCtTTATG TAGGCTGGAG CTGCTTCGAA GTTCCTATAC TTTCTAGAGA ATAGGAACTT3601 CGGAATAGGA ACTTCAAGAT CCCCTTATTA GAAGAACTCG TCAAGAAGGC GATAGAAGGC3661 GATGCGCTGC GAATCGGGAG CGGCGATACC GTAAAGCACG AGGAAGCGGT CAGCCCATTC3721 GCCGCCAAGC TCTTCAGCAA TATCACGGGT AGCCAACGCT ATGTCCTGAT AGCGGTCCGC3781 CACACCCAGC CGGCCACAGT CGATGAATCC tGAAAAGCGG CCATTTTCCA CCATGATATT3841 CGGCAAGCAG GCATCGCCAT GGGTCACGAC GAGATCCTCG CCGTCGGGCA TGCGCGCCTT3901 GAGCCTGGCG AACAGTTCGG CTGGCGCGAG CCCCTGATGC TCTTCGTCCA GATCATCCTG3961 ATCGACAAGA CCGGCTTCCA TCCGAGTACG TGCTCGCTCG ATGCGATGTT TCGCTTGGTG4021 GTCGAATGGG CAGGTAGCCG GATCAAGCGT ATGCAGCCGC CGCATTGCAT CAGCCATGAT4081 GGATACTTTC TCGGCAGGAG CAAGGTGAGA TGACAGGAGA TCCTGCCCCG GCACTTCGCC4141 CAATAGCAGC CAGTCCCTTC CCGCTTCAGT GACAACGTCG AGCACAGCTG CGCAAGGAAC4201 GCCCGTCGTG GCCAGCCACG ATAGCCGCGC TGCCTCGTCC TGCAGTTCAT TCAGGGCACC4261 GGACAGGTCG GTCTTGACAA AAAGAACCGG GCGCCCCTGC GCTGACAGCC GGAACACGGC4321 GGCATCAGAG CAGCCGATTG TCTGTTGTGC CCAGTCATAG CCGAATAGCC TCTCCACCCA4381 AGCGGCCGGA GAACCTGCGT GCAATCCATC TTGTTCAATC ATGCGAAACG ATCCTCATCC4441 TGTCTCTTGA TCAGATCTTG ATCCCCTGCG CCATCAGATC CTTGGCGGCA AGAAAGCCAT4501  CCAGTTTACT TTGCAGGGCT TCCCAACCTT ACCAGAGGGC GCCCCAGCTG GCAATTCCGG4561 TTCGCTTGCT GTCCATAAAA CCGCCCAGTC TAGCTATCGC CATGTAAGCC CACTGCAAGC4621 TACCTGCTTT CTCTTTGCGC TTGCGTTTTC CCTTGTCCAG ATAGCCCAGT AGCTGACATT4681 CATCCGGGGT CAGCACCGTT TCTGCGGACT GGCTTTCTAC GTGTTCCGCT TCCTTTAGCA4741 GCCCTTGCGC CCTGAGTGCT TGCGGCAGCG TGAGCTTCAA AAGCGCTCTG AAGTTCCTAT4801 ACTTTCTAGA 
GAATAGGAAC TTCGAACTGC AGGTCGACGG ATCCCCGGAA TCATGGTTCC
     4861 TCAGGAAACG TGTTGCTGTG GGCTGCGACG ATATGCCCAG ACCATCATGA TCACACCCGC
     4921 GACAATCATC GGGATGGAAA GAATTTGCCC CATGCTGATG TACTGCACCC AGGCACCGGT
     4981 AAACTGCGCG TCGGGCTGGC GGAAAAACTC AACAATGATG CGAAACGCGC CGTAACCAAT
     5041 CAGGAACAAA CCTGAGACAG CTCCCATTGG GCGTGGTTTA CGAATATACA GGTTGAGGAT
     5101 AATAAACAGC ACCACACCTT CCAGCAGCAG CTCGTAAAGC TGTGATGGGT GGCGCGGCAG
     5161 CACACCGTAA GTGTCGAAAA TGGATTGCCA CTGCGGGTTG GTTTGCAGCA GCAAAATATC
     5221 TTCTGTACGG GAGCCAGGGA ACAGCATGGC AAACGGGAAG TTCGGGTCAA CGCGGCCCCA
     5281 CAATTCACCG TTAATAAAGT TGCCCAGACG CCCGGCACCA AGACCAAACG GAATGAGTGG
     5341 TGCGATAAAA TCAGAGACCT GGAAGAAGGA ACGTTTAGTA CGGCGGGCGA AGATAATCAT
     5401 CACCACGATA ACGCCAATCA GGCCGCCGTG GAAAGACATG CCGCCGTCCC AGACACGGAA
     5461 CAGATACAGC GGATCGGCCA TAAACTGCGG GAAATTGTAG AACAGAACAT AACCAATACG
     5521 TCCCCCGAGG AAGACGCCGA GGAAGCCCGC ATAGAGTAAG TTTTCAACTT CATTTTTGGT
     5581 CCAGCCGCTG CCCGGACGAT TCGCCCGTCG TGTTGCCAGC CACATTGCAA AAATGAAACC
     5641 CACCAGATAC ATCAGGCCGT ACCAGTGAAG CGCCACGGGT CCTATTGAGA AAATGACCGG
     5701 ATCAAACTCC GGAAAATGCA GATAGCTACT GGTCATCTGT CACCACAAGT TCTTGTTATT
     5761 TCGCTGAAAG AGAACAGCGA TTGAAATGCG CGCCGCAGGT TTCAGGCGCT CCAAAGGTGC
     5821 GAATAATAGC ACAAGGGGAC CTGGCTGGTT GCCGGATACC GTTAAAAGAT ATGTATA
//

Provided below is the DNA sequence in GenBank format of the configuration of genes at the Escherichia coli nan locus, and the details of the deletion endpoints found in engineered strains E1017 and E1018.

LOCUS W3110_nanRATEKyhcH_region 5861 bp DNA linear BCT 19-FEB-2009
DEFINITION Escherichia coli str. K-12 substr. W3110 strain K-12.
ACCESSION AC_000091
VERSION AC_000091.1 GI:89106884
KEYWORDS .
SOURCE Escherichia coli str. K-12 substr. W3110 (unknown)
ORGANISM Escherichia coli str. K-12 substr. W3110
Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia.
COMMENT PROVISIONAL REFSEQ: This record has not yet been subject to final NCBI review. The reference sequence was derived from AP009048. COMPLETENESS: full length.
FEATURES Location/Qualifiers
source complement(<1..>5861)
/organism=″Escherichia coli str. K-12 substr. 
W3110″/mol_type=″genomic DNA″ /strain=″K-12″ /sub_strain=″W3110″/db_xref=″taxon:316407″ gene complement(<1..6) /gene=″dcuD″ CDScomplement(<1..6) /gene=″dcuD″ /note=″ECK3216:JW3196:b3227″/codon_start=1 /transl_table=11 /product=″predicted transporter″/protein_id=″AP 003769.1″ /db_xref=″GI:89109989″ (SEQ ID NO: 35)/translation=″MFGIIISVIVLITMGYLILKNYKPQVVLAAAGIFLMMCGVWLGFGGVLDPTKSSGYLIVDIYNEILRMLSNRIAGLGLSIMAVGGYARYMERIGASRAMVSLLSRPLKLIRSPYIILSATYVIGQIMAQFITSASGLGMLLMVTLFPTLVSLGVSRLSAVAVIATTMSIEWGILETNSIFAAQVAGMKIATYFFHYQLPVASCVIISVAISHFFVQRAFDKKDKNINHEQAEQKALDNVPPLYYAILPVMPLILMLGSLFLAHVGLMQSELHLVVVMLLSLTVTMFVEFFRKHNLRETMDDVQAFFDGMGTQFANVVTLVVAGEIFAKGLTTIGTVDAVIRGAEHSGLGGIGVMIIMALVIAICAIVMGSGNAPFMSFASLIPNIAAGLHVPAVVMIMPMHFATTLARAVSPITAVVVVTSGIAGVSPFAVVKRTAIPMAVGFVVNMIAT ITLFY″ primer330..348 /label=″ck nanR3 control primer″ gene 386..1177 /gene=″nanR″CDS 386..1177 /gene=″nanR″ /note=″ECK3215:JW3195:b3226″ /codon_start=1/transl_table=11 /product=″DNA-binding transcriptional dual regulator″/protein_id=″AP 003768.1″ /db_xref=″GI:89109988″ (SEQ ID NO: 36)/translation=″MGLMNAFDSQTEDSSPAIGRNLRSRPLARKKLSEMVEEELEQMIRRREFGEGEQLPSERELMAFFNVGRPSVREALAALKRKGLVQINNGERARVSRPSADTIIGELSGMAKDFLSHPGGIAHFEQLRLFFESSLVRYAAEHATDEQIDLLAKALEINSQSLDNNAAFIRSDVDFHRVLAEIPGNPIFMAIHVALLDWLIAARPTVTDQALHEHNNVSYQQHIAIVDAIRRHDPDEADRALQSHLNSVSATWHAFGQTTNKKK″ primer 1005..1025/label=″nanR ck2 control primer″ primer 1126..1146/label=″nanAFck control primer″ promoter 1178..1278/label=″nan operon promoter region″ Site 1187..1191/site_type=″binding site″ /label=″CAP binding″ Site 1198..1202/site_type=″binding site″ /label=″CAP binding″ promoter 1241..1246/label=-10 primer_bind 1252..1301/note=″for dnanA:: or dnanATE::scar deletions″/label=″H1-dnanA lambda red primer″ mRNA 1255 /label=+1 mRNA 1267/label=+13 mRNA 1279 /label=+25 gene 1299..2192 /gene=″nanA″ CDS1299..2192 /gene=″nanA″ /note=″ECK3214:JW3194:b3225″ /codon_start=1/transl_table=11 /product=″N-acetylneuraminate lyase″/protein_id=″AP 003767.1″ 
/db_xref=″GI:89109987″ (SEQ ID NO: 37)/translation=″MATNLRGVMAALLTPFDQQQALDKASLRRLVQFNIQQGIDGLYVGGSTGEAFVQSLSEREQVLEIVAEEAKGKIKLIAHVGCVSTAESQQLAASAKRYGFDAVSAVTPFYYPFSFEEHCDHYRAIIDSADGLPMVVYNIPALSGVKLTLDQINTLVTLPGVGALKQTSGDLYQMEQIRREHPDLVLYNGYDEIFASGLLAGADGGIGSTYNIMGWRYQGIVKALKEGDIQTAQKLQTECNKVIDLLIKTGVFRGLKTVLHYMDVVSVPLCRKPFGPVDEKYLPELKALAQQLMQERG″ Region 1302..4424 /label=″DELETION nanATE″primer_bind complement(2175..2224) /label=″H2-dnanA lambda red primer″gene 2301..3791 /gene=″nanT″ CDS 2301..3791 /gene=″nanT″/note=″ECK3213:JW3193:b3224″ /codon_start=1 /transl_table=11/product=″sialic acid transporter″ /protein_id=″AP 003766.1″/db_xref=″GI:89109986″ (SEQ ID NO: 38)/translation=″MSTTTQNIPWYRHLNRAQWRAFSAAWLGYLLDGFDFVLIALVLTEVQGEFGLTTVQAASLISAAFISRWFGGLMLGAMGDRYGRRLAMVTSIVLFSAGTLACGFAPGYITMFIARLVIGMGMAGEYGSSATYVIESWPKHLRNKASGFLISGFSVGAVVAAQVYSLVVPVWGWRALFFIGILPIIFALWLRKNIPEAEDWKEKHAGKAPVRTMVDILYRGEHRIANIVMTLAAATALWFCFAGNLQNAAIVAVLGLLCAAIFISFMVQSAGKRWPTGVMLMVVVLFAFLYSWPIQALLPTYLKTDLAYNPHTVANVLFFSGFGAAVGCCVGGFLGDWLGTRKAYVCSLLASQLLIIPVFAIGGANVWVLGLLLFFQQMLGQGIAGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAPIIGALIAQRLDLGTALASLSFSLTFVVILLIGLDMPSRVQRWLRPEALRTHDAIDGKPFSGAVPFGSAKNDLVKTKS″ primercomplement(2329..2350) /label=″nanARck control primer″ primer_bind3792..3841 /label=″H1-dnanE lambda red primer″ gene 3839..4528/gene=″nanE″ CDS 3839..4528 /gene=″nanE″ /note=″ECK3212:JW3192:b3223″/codon_start=1 /transl_table=11/product=″predicted N-acetylmannosamine-6-P epimerase″/protein_id=″AP 003765.1″ /db_xref=″GI:89109985″ (SEQ ID NO: 39)/translation=″MSLLAQLDQKIAANGGLIVSCQPVPDSPLDKPEIVAAMALAAEQAGAVAIRIEGVANLQATRAVVSVPIIGIVKRDLEDSPVRITAYIEDVDALAQAGADIIAIDGTDRPRPVPVETLLARIHHHGLLAMTDCSTPEDGLACQKLGAEIIGTTLSGYTTPETPEEPDLALVKTLSDAGCRVIAEGRYNTPAQAADAMRHGAWAVTVGSAITRLEHICQ WYNTAMKKAVL″primer_bind complement(4425..4474) /note=″for dnanATE::scar deletion″/label=″H2-dnanE lambda red primer″ RBS 4425..4448/label=″C-terminal gibberish peptide fused to KD13 scar peptide″ RBS4449..4451 /label=″NEW 
STOP gibberish peptide after resolutionof cassette″ primer_bind 4486..4530 /label=″nanK-H1 lambda red primer″RBS 4515..4520 /label=″nanK RBS″ gene 4525..5400 /gene=″nanK″ CDS4525..5400 /gene=″nanK″ /note=″ECK3211:JW5538:b3222″ /codon_start=1/transl_table=11 /product=″predicted N-acetylmannosamine kinase″/protein_id=″AP 003764.1″ /db_xref=″GI:89109984″ (SEQ ID NO:40/translation=″MTTLAIDIGGTKLAAALIGADGQIRDRRELPTPASQTPEALRDALSALVSPLQAHAQRVAIASTGIIRDGSLLALNPHNLGGLLHFPLVKTLEQLTNLPTIAINDAQAAAWAEFQALDGDITDMVFITVSTGVGGGVVSGCKLLTGPGGLAGHIGHTLADPHGPVCGCGRTGCVEAIASGRGIAAAAQGELAGADAKTIFTRAGQGDEQAQQLIHRSARTLARLIADIKATTDCQCVVVGGSVGLAEGYLALVETYLAQEPAAFHVDLLAAHYRHDAGLLGAALLAQGEKL″ RBS 4526..4528 /label=″Native Stop for NanE″ primercomplement(5065..5083) /label=″nanKckl control primer″ primer_bindcomplement(5380..5424) /label=″nanK-H2 lambda red primer″ gene5397..5861 /gene=″yhcH″ CDS 5397..5861 /gene=″yhcH″/note=″ECK3210:JW3190:b3221″ /codon_start=1 /transl_table=11/product=″hypothetical protein″ /protein_id=″AP 003763.1″/db_xref=″GI:89109983″ (SEQ ID NO: 41)/translation=″MMMGEVQSLPSAGLHPALQDALTLALAARPQEKAPGRYELQGDNIFMNVMTFNTQSPVEKKAELHEQYIDIQLLLNGEERILFGMAGTARQCEEFHHEDDYQLCSTIDNEQAIILKPGMFAVFMPGEPHKPGCVVGEPGEIKKVVVKVKADLMA″ ORIGIN(SEQ ID NO: 42)    1GAACATTGTT GAACTCCGTG TCAAAAGAAA ACGGTCAATC CCATAAACGG CAGATTGAAA   61ACAACGATGT TATATTTTTT GCAAGGCTAT TTATGGTGCG GATGTCGTGT TTTTAATTGT  121AGGTGAGGTG ATTTTTCATT AAAAAATATG CGCTTATGAT TATTTTGTAA GAACACATTC  181ATAATATTCA TAATGCTCGT GAATAGTCTT ATAAATAATT CAAACGGGAT GTTTTTATCT  241GCGTTACATT AATTTTTCGC AATAGTTAAT TATTCCGTTA ATTATGGTAA TGATGAGGCA  301CAAAGAGAAA ACCCTGCCAT TTTCCCCTAC TTTCAATCCT GTGATAGGAT GTCACTGATG  361ATGTTAATCA CACTGACCTT ACAGAATGGG CCTTATGAAC GCATTTGATT CGCAAACCGA  421AGATTCTTCA CCTGCAATTG GTCGCAACTT GCGTAGCCGC CCGCTGGCGC GTAAAAAACT  481CTCCGAAATG GTGGAAGAAG AGCTGGAACA GATGATCCGC CGTCGTGAAT TTGGCGAAGG  541TGAACAATTA CCGTCTGAAC GCGAACTGAT GGCGTTCTTT AACGTCGGGC GTCCTTCGGT  601GCGTGAAGCG CTGGCAGCGT TAAAACGCAA AGGTCTGGTG 
CAAATAAACA ACGGCGAACG  661CGCTCGCGTC TCGCGTCCTT CTGCGGACAC TATCATCGGT GAGCTTTCCG GCATGGCGAA  721AGATTTCCTT TCTCATCCCG GTGGGATTGC CCATTTCGAA CAATTACGTC TGTTCTTTGA  781ATCCAGTCTG GTGCGCTATG CGGCTGAACA TGCCACCGAT GAGCAAATCG ATTTGCTGGC  841AAAAGCACTG GAAATCAACA GTCAGTCGCT GGATAACAAC GCGGCATTCA TTCGTTCAGA  901CGTTGATTTC CACCGCGTGC TGGCGGAGAT CCCCGGTAAC CCAATCTTCA TGGCGATCCA  961CGTTGCCCTG CTCGACTGGC TTATTGCCGC ACGCCCAACG GTTACCGATC AGGCACTGCA 1021CGAACATAAC AACGTTAGTT ATCAACAGCA TATTGCGATC GTTGATGCGA TCCGCCGTCA 1081TGATCCTGAC GAAGCCGATC GTGCGTTGCA ATCGCATCTC AACAGCGTCT CTGCTACCTG 1141GCACGCTTTC GGTCAGACCA CCAACAAAAA GAAATAATGC CACTTTAGTG AAGCAGATCG 1201CATTATAAGC TTTCTGTATG GGGTGTTGCT TAATTGATCT GGTATAACAG GTATAAAGGT 1261ATATCGTTTA TCAGACAAGC ATCACTTCAG AGGTATTTAT GGCAACGAAT TTACGTGGCG 1321TAATGGCTGC ACTCCTGACT CCTTTTGACC AACAACAAGC ACTGGATAAA GCGAGTCTGC 1381GTCGCCTGGT TCAGTTCAAT ATTCAGCAGG GCATCGACGG TTTATACGTG GGTGGTTCGA 1441CCGGCGAGGC CTTTGTACAA AGCCTTTCCG AGCGTGAACA GGTACTGGAA ATCGTCGCCG 1501AAGAGGCGAA AGGTAAGATT AAACTCATCG CCCACGTCGG TTGCGTCAGC ACCGCCGAAA 1561GCCAACAACT TGCGGCATCG GCTAAACGTT ATGGCTTCGA TGCCGTCTCC GCCGTCACGC 1621CGTTCTACTA TCCTTTCAGC TTTGAAGAAC ACTGCGATCA CTATCGGGCA ATTATTGATT 1681CGGCGGATGG TTTGCCGATG GTGGTGTACA ACATTCCAGC CCTGAGTGGG GTAAAACTGA 1741CCCTGGATCA GATCAACACA CTTGTTACAT TGCCTGGCGT AGGTGCGCTG AAACAGACCT 1801CTGGCGATCT CTATCAGATG GAGCAGATCC GTCGTGAACA TCCTGATCTT GTGCTCTATA 1861ACGGTTACGA CGAAATCTTC GCCTCTGGTC TGCTGGCGGG CGCTGATGGT GGTATCGGCA 1921GTACCTACAA CATCATGGGC TGGCGCTATC AGGGGATCGT TAAGGCGCTG AAAGAAGGCG 1981ATATCCAGAC CGCGCAGAAA CTGCAAACTG AATGCAATAA AGTCATTGAT TTACTGATCA 2041AAACGGGCGT ATTCCGCGGC CTGAAAACTG TCCTCCATTA TATGGATGTC GTTTCTGTGC 2101CGCTGTGCCG CAAACCGTTT GGACCGGTAG ATGAAAAATA TCTGCCAGAA CTGAAGGCGC 2161TGGCCCAGCA GTTGATGCAA GAGCGCGGGT GAGTTGTTTC CCCTCGCTCG CCCCTACCGG 2221GTGAGGGGAA ATAAACGCAT CTGTACCCTA CAATTTTCAT ACCAAAGCGT GTGGGCATCG 2281CCCACCGCGG GAGACTCACA ATGAGTACTA CAACCCAGAA TATCCCGTGG TATCGCCATC 2341TCAACCGTGC 
ACAATGGCGC GCATTTTCCG CTGCCTGGTT GGGATATCTG CTTGACGGTT 2401TTGATTTCGT TTTAATCGCC CTGGTACTCA CCGAAGTACA AGGTGAATTC GGGCTGACGA 2461CGGTGCAGGC GGCAAGTCTG ATCTCTGCAG CCTTTATCTC TCGCTGGTTC GGCGGCCTGA 2521TGCTCGGCGC TATGGGTGAC CGCTACGGGC GTCGTCTGGC AATGGTCACC AGCATCGTTC 2581TCTTCTCGGC CGGGACGCTG GCCTGCGGCT TTGCGCCAGG CTACATCACC ATGTTTATCG 2641CTCGTCTGGT CATCGGCATG GGGATGGCGG GTGAATACGG TTCCAGCGCC ACCTATGTCA 2701TTGAAAGCTG GCCAAAACAT CTGCGTAACA AAGCCAGTGG TTTTTTGATT TCAGGCTTCT 2761CTGTGGGGGC CGTCGTTGCC GCTCAGGTCT ATAGCCTGGT GGTTCCGGTC TGGGGCTGGC 2821GTGCGCTGTT CTTTATCGGC ATTTTGCCAA TCATCTTTGC TCTCTGGCTG CGTAAAAACA 2881TCCCGGAAGC GGAAGACTGG AAAGAGAAAC ACGCAGGTAA AGCACCAGTA CGCACAATGG 2941TGGATATTCT CTACCGTGGT GAACATCGCA TTGCCAATAT CGTAATGACA CTGGCGGCGG 3001CTACTGCGCT GTGGTTCTGC TTCGCCGGTA ACCTGCAAAA TGCCGCGATC GTCGCTGTTC 3061TTGGGCTGTT ATGCGCCGCA ATCTTTATCA GCTTTATGGT GCAGAGTGCA GGCAAACGCT 3121GGCCAACGGG CGTAATGCTG ATGGTGGTCG TGTTGTTTGC TTTCCTCTAC TCATGGCCGA 3181TTCAGGCGCT GCTGCCAACG TATCTGAAAA CCGATCTGGC TTATAACCCG CATACTGTAG 3241CCAATGTGCT GTTCTTTAGT GGCTTTGGCG CGGCGGTGGG ATGCTGCGTA GGTGGCTTCC 3301TCGGTGACTG GCTGGGAACC CGCAAAGCGT ACGTTTGTAG CCTGCTGGCC TCGCAGCTGC 3361TGATTATTCC GGTATTTGCG ATTGGCGGCG CAAACGTCTG GGTGCTCGGT CTGTTACTGT 3421TCTTCCAGCA AATGCTTGGA CAAGGGATCG CCGGGATCTT ACCAAAACTG ATTGGCGGTT 3481ATTTCGATAC CGACCAGCGT GCAGCGGGCC TGGGCTTTAC CTACAACGTT GGCGCATTGG 3541GCGGTGCACT GGCCCCAATC ATCGGCGCGT TGATCGCTCA ACGTCTGGAT CTGGGTACTG 3601CGCTGGCATC GCTCTCGTTC AGTCTGACGT TCGTGGTGAT CCTGCTGATT GGGCTGGATA 3661TGCCTTCTCG CGTTCAGCGT TGGTTGCGCC CGGAAGCGTT GCGTACTCAT GACGCTATCG 3721ACGGTAAACC ATTCAGCGGT GCCGTGCCGT TTGGCAGCGC CAAAAACGAT TTAGTCAAAA 3781CCAAAAGTTA ATCCTGTTGC CCGGTCTATG TACCGGGCCT TTCGCTAAGG GAAGATGTAT 3841GTCGTTACTT GCACAACTGG ATCAAAAAAT CGCTGCTAAC GGTGGCCTGA TTGTCTCCTG 3901CCAGCCGGTT CCGGACAGCC CGCTCGATAA ACCCGAAATC GTCGCCGCCA TGGCATTAGC 3961GGCAGAACAG GCGGGCGCGG TTGCCATTCG CATTGAAGGT GTGGCAAATC TGCAAGCCAC 4021GCGTGCGGTG GTGAGCGTGC CGATTATTGG AATTGTGAAA 
CGCGATCTGG AGGATTCTCC 4081GGTACGCATC ACGGCCTATA TTGAAGATGT TGATGCGCTG GCGCAGGCGG GCGCGGACAT 4141TATCGCCATT GACGGCACCG ACCGCCCGCG TCCGGTGCCT GTTGAAACGC TGCTGGCACG 4201TATTCACCAT CACGGTTTAC TGGCGATGAC CGACTGCTCA ACGCCGGAAG ACGGCCTGGC 4261ATGCCAAAAG CTGGGAGCCG AAATTATTGG CACTACGCTT TCTGGCTATA CCACGCCTGA 4321AACGCCAGAA GAGCCGGATC TGGCGCTGGT GAAAACGTTG AGCGACGCCG GATGTCGGGT 4381GATTGCCGAA GGGCGTTACA ACACGCCTGC TCAGGCGGCG GATGCGATGC GCCACGGCGC 4441GTGGGCGGTG ACGGTCGGTT CTGCAATCAC GCGTCTTGAG CACATTTGTC AGTGGTACAA 4501CACAGCGATG AAAAAGGCGG TGCTATGACC ACACTGGCGA TTGATATCGG CGGTACTAAA 4561CTTGCCGCCG CGCTGATTGG CGCTGACGGG CAGATCCGCG ATCGTCGTGA ACTTCCTACG 4621CCAGCCAGCC AGACACCAGA AGCCTTGCGT GATGCCTTAT CCGCATTAGT CTCTCCGTTG 4681CAAGCTCATG CGCAGCGGGT TGCCATCGCT TCGACCGGGA TAATCCGTGA CGGCAGCTTG 4741CTGGCGCTTA ATCCGCATAA TCTTGGTGGA TTGCTACACT TTCCGTTAGT CAAAACGCTG 4801GAACAACTTA CCAATTTGCC GACCATTGCC ATTAACGACG CGCAGGCCGC AGCATGGGCG 4861GAGTTTCAGG CGCTGGATGG CGATATAACC GATATGGTCT TTATCACCGT TTCCACCGGC 4921GTTGGCGGCG GTGTAGTGAG CGGCTGCAAA CTGCTTACCG GCCCTGGCGG TCTGGCGGGG 4981CATATCGGGC ATACGCTTGC CGATCCACAC GGCCCAGTCT GCGGCTGTGG ACGCACAGGT 5041TGCGTGGAAG CGATTGCTTC TGGTCGCGGC ATTGCAGCGG CAGCGCAGGG GGAGTTGGCT 5101GGCGCGGATG CGAAAACTAT TTTCACGCGC GCCGGGCAGG GTGACGAGCA GGCGCAGCAG 5161CTGATTCACC GCTCCGCACG TACGCTTGCA AGGCTGATCG CTGATATTAA AGCCACAACT 5221GATTGCCAGT GCGTGGTGGT CGGTGGCAGC GTTGGTCTGG CAGAAGGGTA TCTGGCGCTG 5281GTGGAAACGT ATCTGGCGCA GGAGCCAGCG GCATTTCATG TTGATTTACT GGCGGCGCAT 5341TACCGCCATG ATGCAGGTTT ACTTGGGGCT GCGCTGTTGG CCCAGGGAGA AAAATTATGA 5401TGATGGGTGA AGTACAGTCA TTACCGTCTG CTGGGTTACA TCCTGCGTTA CAGGACGCGT 5461TAACGCTGGC ATTAGCTGCC AGACCGCAAG AAAAAGCGCC GGGTCGTTAC GAATTACAGG 5521GCGACAATAT CTTTATGAAT GTCATGACGT TTAACACTCA ATCGCCCGTC GAGAAAAAAG 5581CGGAATTGCA CGAGCAATAC ATTGATATCC AGCTGTTATT AAACGGTGAG GAACGGATTC 5641TGTTTGGCAT GGCAGGCACT GCGCGTCAGT GTGAAGAGTT CCACCATGAG GATGATTATC 5701AGCTTTGCAG CACCATTGAT AACGAGCAAG CCATCATCTT AAAACCGGGA ATGTTCGCCG 5761TGTTTATGCC 
AGGTGAACCG CATAAACCAG GATGCGTTGT CGGCGAGCCT GGAGAGATTA 5821AAAAGGTTGT GGTGAAGGTT AAGGCTGATT TAATGGCTTA A  //
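The coordinates in the feature table above can be checked with simple interval arithmetic. The following is a minimal sketch, assuming that the region labeled "DELETION nanATE" (1302..4424) marks the span removed in the nanATE deletion and that all coordinates are 1-based and inclusive, as is conventional for GenBank records. The helper names `overlap` and `deleted_fraction` are hypothetical and are introduced only for illustration.

```python
# Gene coordinates (1-based, inclusive) copied from the feature table above.
GENES = {
    "nanA": (1299, 2192),
    "nanT": (2301, 3791),
    "nanE": (3839, 4528),
    "nanK": (4525, 5400),
}

# Region labeled "DELETION nanATE" in the feature table (assumed to be the removed span).
DELETION_NANATE = (1302, 4424)


def overlap(a, b):
    """Number of positions shared by two inclusive intervals (hypothetical helper)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def deleted_fraction(gene, deletion=DELETION_NANATE):
    """Fraction of a gene's annotated span that lies inside the deletion region."""
    start, end = GENES[gene]
    length = end - start + 1
    return overlap((start, end), deletion) / length


for name in GENES:
    print(f"{name}: {overlap(GENES[name], DELETION_NANATE)} bp inside the deletion "
          f"({deleted_fraction(name):.1%} of the annotated span)")
```

Under these assumptions, the annotated deletion covers nearly all of nanA, all of nanT, and most of nanE, and does not overlap the annotated nanK coding region.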

Other Embodiments

While the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference. All published foreign patents and patent applications cited herein are hereby incorporated by reference. Genbank and NCBI submissions indicated by accession number cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.

While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

What is claimed is:
 1. A method for producing a sialylated oligosaccharide in an Escherichia coli (E. coli) bacterium, wherein said sialylated oligosaccharide comprises 3′-sialyllactose (3′-SL) or 6′-sialyllactose (6′-SL), said method comprising: (i) providing an E. coli bacterium, said bacterium comprising an exogenous sialyl-transferase comprising an α(2,3) sialyl-transferase, an α(2,6) sialyl-transferase, or an α(2,8) sialyltransferase, a mutation in an endogenous N-acetylneuraminate lyase gene (nanA), wherein said bacterium comprises an endogenous N-acetylmannosamine kinase gene (nanK) that is not mutated, an increased UDP-GlcNAc production capability comprising overexpression of nagC, such that the bacterium produces at least 10% more UDP-GlcNAc than a native E. coli bacterium, a sialic acid synthetic capability, and a functional lactose permease gene, and (ii) culturing said bacterium in the presence of lactose.
 2. The method of claim 1, wherein said bacterium comprises a null mutation in any one of the genes selected from endogenous N-acetylneuraminate lyase gene (nanA), endogenous N-acetylmannosamine-6-phosphate epimerase gene (nanE), and endogenous N-acetylneuraminic acid transporter gene (nanT), or any combination thereof.
 3. The method of claim 1, wherein said bacterium comprises a null mutation in an endogenous N-acetylneuraminate lyase gene (nanA).
 4. The method of claim 1, wherein said bacterium comprises an endogenous N-acetylmannosamine-6-phosphate epimerase gene (nanE) that is not mutated, and (i) a null mutation in the endogenous N-acetylneuraminate lyase gene (nanA), (ii) a null mutation in an endogenous N-acetylneuraminic acid transporter gene (nanT), or (iii) a null mutation in the endogenous N-acetylneuraminate lyase gene (nanA) and a null mutation in the endogenous N-acetylneuraminic acid transporter gene (nanT).
 5. The method of claim 1, wherein said bacterium comprises a null mutation in endogenous N-acetylneuraminate lyase gene (nanA), and a null mutation in endogenous N-acetylmannosamine-6-phosphate epimerase gene (nanE).
 6. The method of claim 1, wherein said sialic acid synthetic capability comprises an exogenous CMP-Neu5Ac synthetase gene (neuA), an exogenous sialic acid synthase gene (neuB), and an exogenous UDP-GlcNAc 2-epimerase (neuC).
 7. The method of claim 1, wherein said α(2,3) sialyl-transferase, α(2,6) sialyl-transferase, or α(2,8) sialyltransferase, comprises a sequence of a Photobacterium sp. sialyl-transferase, Campylobacter jejuni sialyl-transferase, Neisseria meningitidis sialyl-transferase, or Neisseria gonorrhoeae sialyl-transferase.
 8. The method of claim 1, wherein said sialylated oligosaccharide comprises 6′-sialyllactose (6′-SL).
 9. The method of claim 1, wherein said bacterium comprises a deleted or inactivated endogenous β-galactosidase gene.
 10. The method of claim 9, wherein said deleted or inactivated β-galactosidase gene comprises an E. coli lacZ gene.
 11. The method of claim 1, wherein said bacterium comprises a recombinant β-galactosidase gene providing a level of β-galactosidase activity between 0.05 and 200 units.
 12. The method of claim 1, wherein said bacterium further comprises a deleted, inactivated, or mutated lacA gene.
 13. The method of claim 1, wherein said E. coli bacterium comprises an increased UDP-GlcNAc production capability, such that it produces at least 20% more UDP-GlcNAc than a native E. coli bacterium.
 14. The method of claim 1, wherein said increased UDP-GlcNAc production capability further comprises overexpression of a glmS gene, a glmY gene, a glmZ gene or any combination thereof.
 15. The method of claim 1, wherein said increased UDP-GlcNAc production capability comprises overexpression of nagC and glmS.
 16. The method of claim 1, wherein said increased UDP-GlcNAc production capability comprises overexpression of nagC and glmY.
 17. The method of claim 1, wherein said increased UDP-GlcNAc production capability comprises overexpression of nagC and glmZ.
 18. A method of purifying a sialylated oligosaccharide produced by the method of claim 1, comprising binding said sialylated oligosaccharide from a bacterial cell lysate or bacterial cell culture supernatant of said bacterium to a carbon column, and eluting said sialylated oligosaccharide from said column.
 19. A purified sialylated oligosaccharide produced by the method of claim 1.
 20. The method of claim 1, further comprising retrieving said sialylated oligosaccharide from said bacterium or from a culture supernatant of said bacterium.
 21. The method of claim 1, wherein said bacterium comprises a mutation in an endogenous N-acetylmannosamine-6-phosphate epimerase gene (nanE).
 22. The method of claim 1, wherein said bacterium comprises an endogenous N-acetylneuraminic acid transporter gene (nanT) that is not mutated.
 23. The method of claim 1, wherein said sialylated oligosaccharide comprises 3′-sialyllactose (3′-SL).
 24. The method of claim 1, wherein said mutation is within the coding region of nanA.
 25. The method of claim 24, wherein the mutation comprises an amino acid deletion or insertion.
 26. The method of claim 25, wherein the mutation causes a loss of function of a nanA gene product or loss of production of a nanA gene product.